A Conversation With Amazon's CTO: Stories, Insights, and Secrets From Inside Amazon

If you want to be a fast-growing company, you can't be like a traditional business.

Compilation: Deep Tide TechFlow

Note: This article is part of the Deep Tide TechFlow series "YC Entrepreneurship Course Chinese Notes" (updated daily; this is the last article in the series), which collects and organizes Chinese versions of the YC courses. This twenty-fifth article covers the online lecture "Experiences and Insights into Technology and Startups" by Amazon CTO Werner Vogels.

Before joining Amazon

Before joining Amazon, I was an academic who spent ten years at Cornell University doing research and building large-scale distributed systems. Before that, I wasn't a typical computer scientist. It was not until I was 28 that I really decided to go back to school. Previously, I worked in radiotherapy, treating cancer patients at the Netherlands Cancer Institute. One day it dawned on me that I could no longer bear the death of people around me, so I decided to do something that had nothing to do with it. Computer science seemed like a good choice.

It was the mid-80s, and computer science wasn't as popular as it is now. However, it turned out that I had a talent for it that I hadn't known about. So I started digging deeper, because that's what I was really interested in. I got my Ph.D., worked for a few years at a research facility in Portugal, and then was invited to join Cornell.

During my time at Cornell, in addition to my research, I often consulted for large companies such as HP (other names were inaudible in the recording) and took part in various conferences. Once, Amazon invited me to present some of the work I was doing. I was a little surprised and confused at first: really? Do I need to do this?

How hard could web pages and databases be? Once I looked more closely, however, I realized that this was actually a huge technical challenge. Amazon isn't just a retailer; it's a technology company operating at a scale I had never seen before, well beyond any other company I had consulted for. From a distributed systems research perspective, the challenges it faced were staggering.

When Amazon offered me a job, I took it without hesitation.

Amazon's scale operation and technology leadership

I think most distributed systems researchers have recognized the scale at which these large companies need to operate, and this is not limited to large companies. Any Internet or digital company that wants to be successful needs to operate at enormous scale.

Looking back to 2004, when I joined Amazon, many people probably thought it was relatively easy to operate at the scale Amazon had reached by then. But that doesn't mean you can rely on a position or essential infrastructure. Therefore, a lot of work has been done on cloud technology and other technologies to ensure that the advantages they provide can be fully utilized.

Amazon reached its size in 2004 only by learning through doing; there were no books or guides explaining how to build a scalable organization or company. So I think Amazon is ahead of the curve, five to ten years ahead, in terms of technology adoption, technology evolution, and scale of operations. This is especially important for companies pursuing rapid growth.

If you want to be a fast-growing company, you can't be like a traditional business. Traditional businesses often face the innovator's dilemma, where once something succeeds, it becomes very slow.

How to build a company that keeps growing rapidly is a whole different story, and you have to weigh the trade-offs carefully. For example, taking on technical debt or tolerating some duplicated work is not acceptable in traditional enterprises, because efficiency is their main goal.

At Amazon, being fast, innovating fast, and having a long pipeline of experiments is what matters. So you're willing to tolerate some duplicated work and to let some technical debt accrue, as long as you know you have to pay it off.

The kinds of compromises Amazon is willing to make are hard to find in traditional MBA books. For the most part, Amazon had to develop its technology, processes, and business processes on its own. Of course, it helps to have a visionary leader like Jeff Bezos, who truly understands what the future will look like and what the modern world should look like.

Key to growth

While Amazon has achieved great success at scale, we still face the challenge of achieving greater growth. To move to the next stage of growth, we need to think and act more critically.

One example is performance. How do we measure it? What kind of infrastructure do we need to take the measurements? Becoming a truly data-driven company that makes data-driven decisions starts with owning the data and building a culture around how those measurements are evaluated and interpreted.

Even a slight delay of 1.2 seconds in page load time can have a negative impact on customer experience. A median figure only shows that 50% of customer experiences are worse than it; you need to understand how much worse. Thresholds such as the 99th or 99.9th percentile therefore become more important from an engineering perspective. Then you need a mechanism that takes that 99th-percentile engineering discipline and connects it to business decisions.
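To make the difference between a median and the 99% or 99.9% thresholds concrete, here is a minimal Python sketch. It assumes the nearest-rank definition of a percentile, and the `load_times` sample is hypothetical data chosen for illustration, not a measurement from Amazon.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical page load times in seconds: mostly fast, with a slow tail.
load_times = [0.4] * 90 + [1.2] * 8 + [3.5, 6.0]

p50 = percentile(load_times, 50)     # what the "typical" customer sees
p99 = percentile(load_times, 99)     # what the slowest 1% start to see
p999 = percentile(load_times, 99.9)  # the far tail

print(f"p50={p50}s  p99={p99}s  p99.9={p999}s")
```

In this made-up sample the median looks healthy while the tail is many times slower, which is why a team might track the high percentiles rather than the middle of the distribution.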

I think in 2004, our reliability was pretty high. There were rules that bound us; for example, we had to use data centers in a specific area (SEA), and whatever you did in those SEA data centers had to be replicated in other data centers so that if one data center went down, customers would not be affected.

While customers may experience delays, functionality will not be affected. We were pretty good at dealing with all these rules, until one day we decided to disconnect one of the data centers and see what happened. With just a little tweaking of the network, one data center can be isolated from the others.

In reality, however, everything that looks so good on paper doesn't work as well in practice as one might think. We repeated these rehearsals throughout the year. The first attempt still involved many manual processes, such as manual database failover, but by the third or fourth run we had reached the point where the exercise could be almost fully automated, without human intervention. This is critical to ensuring high availability and fault tolerance.

In our efforts, we also focus on developing data analytics and insights. Amazon has a lot of data, but how to get valuable insights from it is a challenge. We are dedicated to building powerful data analysis tools and models that help us understand customer behavior, market trends and business opportunities. This data-driven approach enables us to make better decisions and provide personalized products and services.

In addition, we are also working on improving the user experience. We have thoroughly studied the needs and preferences of users during the shopping process, and improved user experience through design optimization, interface improvement and personalized recommendations. We pursue a simple, intuitive and seamless shopping experience to meet our customers' expectations and earn their loyalty.

The changing role of the CTO

Your role as CTO changes when you become a technology provider.

I touched on this issue in a blog post. In my opinion, a CTO has four different types of responsibilities:

  • The first is the enterprise-level CTO, who usually reports to the CIO and is responsible for managing a large amount of infrastructure.
  • The second type is the technical co-founder CTO, which is common among young companies, and they have technical vision. But I think there is some risk in this role, because many other things, such as the management of engineering teams, etc., will be included in the remit of this role. The CTO may not necessarily be good at handling these matters, but we'll discuss that in more detail later.
  • The third type is the CTO who is a big thinker, and they drive the development of innovation. Companies like AT&T and Lucent, for example, have a CTO or a CTO office dedicated to researching and experimenting with next-generation technologies.
  • The final type is the externally facing technologist CTO, who is responsible for deep technical interactions with customers, understanding how customers use the company's products, and looking for deeper, broader patterns and customer pain points. This role focuses not only on the company's own technology but also on the bigger picture.

It's important to note that these roles are more customer-centric than just technology-centric. It’s important to bring customer feedback back to the company and think about what new features or products need to be developed, or what processes need to be changed to better serve customers.

Therefore, the CTO role as a technology supplier is more customer-oriented, rather than just focusing on the technology itself.

Amazon's unique culture

Since Amazon was my first real job, I had long assumed that the work culture elsewhere was similar to Amazon, but it wasn't.

Amazon has a unique culture that works really well for a fast-growing company. They encourage teams to be as independent as possible and cut down on organizational hierarchies and structures. Hierarchy seems unnatural to them.

They want self-organizing teams and hire people who really want to work independently and own the product. Young businesses especially need such people, rather than followers or mere coders.

Amazon has a set of 14 leadership principles, including Customer Obsession, Ownership, and Dive Deep, which drive its culture.

At Amazon, recruiting interviews are also largely centered around cultural fit, as an employee who doesn't fit the culture can be very disruptive to a small team. Amazon is very respectful of small teams, usually composed of 10 to 12 people, and each member knows their own tasks.

As the business grows, the role of the CTO changes, starting with being responsible for all technology-related matters, and gradually focusing on team management and ensuring that engineers can deliver the required technology and products.

Compared to the VP of engineering, the CTO is more concerned with the technical aspects such as building the right technology and using the right tools.

How did Amazon develop?

Amazon has gone through a series of changes internally. They create independent teams that look like startups and own their own goals and innovation agenda. However, in the past, Amazon has violated architectural principles in order to grow rapidly, resulting in a brittle back-end database infrastructure that could no longer grow.

To address this loss of efficiency, they turned to a service-oriented architecture, which splits the system into independent functional building blocks, or microservices.

However, as the team grows, each service needs to manage its own database, leading to increased communication but less innovation.

To improve the situation, they created a shared services platform, using virtualization technology and APIs to manage servers. They first built these technologies internally and then launched products externally, such as Amazon S3 and EC2, services that make storage and computing power programmable and scalable, very cost-effective for enterprises.

Amazon's goal is to achieve Internet-scale storage and computing power and provide services to various types of businesses.

The road to innovation

Amazon's innovation is divided into two levels:

  • The first is innovation at the team level. Each team is responsible for formulating its own innovation plan for the next year and completing tasks independently, such as improving the recommendation engine to reduce the number of returns. They are responsible for creating roadmaps, acquiring new data sources and engaging with customers in different ways.
  • The second is innovation that requires significant capital investment, such as Kindle and Amazon Prime. These projects need substantial financial support. Amazon's rule is that large capital investments are made only if an innovative project has the potential to succeed and have a significant impact on the company's balance sheet.

Amazon realized that some of the decisions made early in the technology's development were smart, and that as they scaled, they had to revisit the architecture and develop software that adapted to change. This involves using multiple architectures and versions to handle technical challenges such as storage engines.

Also, like other companies, Amazon found that in addition to technical scaling, it also needed to address non-technological factors such as sales, solution architecture, technical account managers, and customer support. These factors are all necessary to build a successful company.

How to launch new services?

We expect all teams to be in close contact with customers, as approximately 95% of the features and services we provide are in response to direct customer needs. At the beginning, the earliest services we established met almost all expectations of customers, including basic IT infrastructure, storage, computing, database, network and security.

Over time, however, customers came up with various other needs. They want analytics capabilities, cloud technology, mobile development, and now blockchain and other technologies that they want to be able to use without having to manage them. Therefore, it becomes very important to help customers build the right features and tools.

When we launch new products and services, we follow a strong culture of launching with a minimum feature set (an MVP). But that's just a starting point for building the technology you need to build your business. We can't just ship something flaky; we need to make sure it's stable and reliable. We then discuss the requirements for additional functionality with customers.

In the initial stages of a product, we don't always know what additional features customers want. For example, when we launched DynamoDB, we didn't know customers wanted secondary indexes. We didn't offer them from the start, but it soon became clear that this was what customers wanted. By launching a service with a minimal feature set, we can observe how customers use it, then iterate and add new features and services incrementally.

For example, when we launched Lambda, it was a serverless environment that made development easy: just write code and deploy it to S3, without having to think about other things such as servers. You pay only for what you actually use and don't need to worry about things like idle time.
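As a rough illustration of the "just write code" model, here is a minimal sketch of a Python function in the shape Lambda expects. The `lambda_handler(event, context)` signature follows the usual convention for Python handlers; the `name` field in the event and the greeting logic are hypothetical and only serve the example.

```python
import json


def lambda_handler(event, context):
    """Entry point that the Lambda runtime calls once per invocation."""
    # 'name' is a hypothetical field in the incoming event, used for illustration.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local usage example; in Lambda the runtime supplies both arguments.
result = lambda_handler({"name": "Werner"}, None)
print(result["body"])
```

There is no server setup, process management, or scaling logic anywhere in the file, which is the point of the model described above.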

This changes the development process, and we can observe how customers use the product. They quickly started iterating with debugging environments such as X-Ray and using Step Functions to build more complex applications. We learn our customers' needs by observing their usage habits; with DynamoDB, for example, we realized that secondary indexes mattered more to customers than secondary data centers.

Basically, the customer redefines our roadmap and we start delivering the features that matter most to them. This is a very important part. Even if it looks like an MVP, we can't see it as an MVP because people will build their business on it and depend on it. Thus, a different cultural structure is formed around the product.

Last year we released 1,400 new features and services, and that number will of course continue to grow as the number of teams grows. We use the same structure in AWS, where each team works with a specific segment of customers and builds a roadmap based on their needs. As the number of services grows, so does the roadmap.

However, this is a rapidly advancing environment, and the way software is built has changed dramatically. If we decided how customers should develop software, we might still be doing it the way we did five or ten years ago. Instead, we need to support the way software will be developed in 2020 or 2025.

So instead of making decisions for our customers, we need to work closely with them and let them drive our innovation engine. We need to closely observe how our customers use our products and constantly iterate and improve based on their feedback.

In general, Amazon Web Services (AWS) innovates at two levels: the team level and the level of large capital investment. By working closely with customers to understand their needs and observe their usage habits, AWS is able to provide features and services that meet those needs. At the same time, AWS invests substantial capital in researching, developing, and launching new products, services, and features to meet changing market needs.

This innovative approach allows AWS to maintain a close relationship with customers to ensure that they can provide stable and reliable solutions that meet customer expectations. Through the release strategy based on the minimum function set and continuous iteration, AWS can quickly respond to customer needs and continue to provide more advanced functions and services.

Amazon's road to innovation is a process of constant evolution and development, always customer-centric. Through in-depth understanding of customer needs, observation of customer usage habits, and continuous investment in research and development, AWS is constantly promoting the advancement of technology and business, and providing customers with excellent cloud computing solutions.

Build customer-driven products

Everything starts with the customer, whether that customer is external or inside Amazon. As a technology company, we're very focused on developing what really matters to our customers. Although we are a heavily technical company, engineering and engineers also take risks.

We focus on products, not just technology. We want to know what we can do for our customers. We're committed to building amazing technology, but that's not the only thing that drives our actions. What we care about is solving our customers' problems.

To keep our focus on customers, we use a process called Working Backwards. First, we write a press release that clearly and concisely describes what we're going to build. We then prepare a document with 20 frequently asked questions and answer them in plain, simple language. In more complex cases, we may need to revise these two documents iteratively until we are completely clear about what we are trying to build.

Next, we write user experience (UX) documents that describe in detail how customers use our product and how they interact with it. We also write user manuals, glossaries, and other related documentation.

We end up with a set of four documents that describe exactly what we're going to do.

At Amazon, we have always followed this principle: we will not build more than what we promised. We don't randomly add second-version functionality to the first version. We're focused on building the features we committed to, and just that. This approach provides a strong structure for thinking about customer needs, product experience, and technology.

In Amazon meetings, we don't use slides or presentations. Instead, we have a document called a six-page memo, which everyone reads silently for 30 minutes at the start of the meeting. This memo is very important because it ensures that everyone has a clear understanding of what we are discussing.

Writing such a narrative is difficult, so we encourage collaboration and feedback. We revise and refine the memo many times until it clearly describes a feature, product, or business area. After 30 minutes of reading, everyone in the room is on the same page, which contributes to a high-quality discussion.

Together, we have a unique culture and processes to ensure we remain focused on solving customer problems and delivering exceptional products and services.

Container technology

More and more companies are adopting container technology, especially as they move toward microservices environments. One reason containers have become popular is that they make it easy to scale components up and down, which fits the microservices concept. Many people are starting to break monoliths apart into containers, especially around serverless environments.

However, using containers came with some difficulties, especially before Fargate launched. You had to manage multiple containers running in multiple availability zones and map them to virtual machines. So, while containers are a great development choice, a lot of work was still required to run and manage them. To simplify this, we offer Fargate, which removes essentially all management of the underlying virtual machines: you hand over the container and it runs.

In the future, I think there will be more and more tools, supporting platforms, and infrastructure developed around the ability to build more complex serverless environments. Better integration with other services will be one of the directions of development.

*Deep Tide Note: Fargate is a computing service provided by Amazon Web Services (AWS), which is a serverless computing engine. Fargate makes it easier for developers to manage and deploy containerized applications without concern for the underlying infrastructure and servers.

Container technology is a virtualization technology that enables rapid deployment and portability of applications by packaging applications and their dependencies into independent, portable containers. Container technology uses a container engine (such as Docker) to create, manage, and run containers, enabling applications to run in a consistent manner in different computing environments without worrying about differences in the underlying infrastructure. Containerization technology is widely used in modern application development and deployment, which provides higher flexibility, scalability and portability.

Protect customers

However, I think security issues will come into focus. In the next five years, everyone should make security a top priority. Whether it is the CEO, CTO or engineer, we all need to be security aware and play the role of security engineers. We, as technologists and digital business leaders, should be embarrassed and outraged by the massive data breaches that have occurred every week for the past few years. Protecting customer data is our responsibility because without protecting customers, there is no business.

We need to start thinking about how to protect the data we collect from our customers, whether for a car rental or any other consumer service. Security needs to be the default: for example, triggering security checks in continuous integration and continuous deployment pipelines, and ensuring that new open source libraries are reviewed and evaluated when they are added.

The development pipeline itself also needs to be secure and equipped with various automated tools for vulnerability testing. Especially in the fields of healthcare and finance, there are various regulations and regulatory requirements that need to be complied with.
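One way a pipeline step could enforce the "review new open source libraries" rule is sketched below in Python. It assumes dependencies are listed in a simple requirements-style text and that the team keeps a set of already reviewed package names; `parse_requirements`, `unreviewed_dependencies`, and the sample package names are all hypothetical, and this is not a description of Amazon's tooling.

```python
def parse_requirements(text):
    """Extract lower-cased package names from simple requirements-style lines."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Keep only the name, dropping version specifiers such as == or >=.
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<"):
            line = line.split(sep, 1)[0]
        names.add(line.strip().lower())
    return names


def unreviewed_dependencies(requirements_text, reviewed):
    """Return the sorted dependencies that have not yet been through review."""
    reviewed_names = {name.lower() for name in reviewed}
    return sorted(parse_requirements(requirements_text) - reviewed_names)


# Hypothetical inputs: the project's dependency list and the reviewed set.
requirements = """
requests==2.31.0
left-pad-py>=0.1   # newly added by a pull request
boto3
"""
reviewed = {"requests", "boto3"}

pending = unreviewed_dependencies(requirements, reviewed)
if pending:
    print("Security review required for:", ", ".join(pending))
```

In a real pipeline, a non-empty `pending` list would typically fail the build until someone reviews the new library, so the check happens by default rather than by memory.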

In five years, I expect we'll all be highly aware of security issues and make protecting our customers a top priority. At Amazon, protecting our customers will always be our number one area of investment, whether in terms of intellectual or financial capital.

Common mistakes startups make with AWS

First, for those with traditional data center experience, using AWS for the first time can lead to a lack of confidence. AWS has advantages in elasticity and availability, but you cannot realize its full potential without using higher-level services such as security, data analytics, and mobile services, especially when pursuing highly reliable development at scale.

Second, it is critical to determine the type and goals of the company. There are two distinct styles: fast-growing companies with large customer volumes, which focus less on revenue and invest heavily to expand quickly and potentially be acquired; and sustainable companies, which want to build a long-term business rather than aim for an acquisition.

These two types of companies use AWS very differently. For companies pursuing rapid growth, they can use the capacity and services provided by AWS with more confidence, because they don't need to worry too much about cost. For companies pursuing sustainable development, they need to build a different architecture, pay more attention to cost control, and ensure that there is a clear relationship between cost and customer acquisition.

For startup founders, Jeff Bezos often draws a distinction between mercenaries and missionaries. A mercenary throws himself into a startup for the money, while a missionary does it for the love of the product. Both are effective ways to start a business, but the technical support and technical architecture built will be different.

So, figure out what type of company you are and choose the right tech support and architecture accordingly.
