Category git

Dependency Update and Artifacts Promotion in Multi-repo Project

We all know Google employs a version tracking system that uses a single repository/depot. Every closed-source Google product that you love is tracked by this single repo, which is so...

Git as Version Vector

Git is one of the most widely used version control systems. Traditionally, a Git repository is considered a complete history of the entire project in the form of...

Git: Branch off An Unmerged Branch While Committing Often - Disasters and Salvage

Committing often and pushing often have been advocated as good practice when using Git, since they save your latest work on the remote even if your hard drive dies right after and...

Category web

How to Configure Applications for High Availability in Kubernetes

Pods in Kubernetes are the smallest orchestration unit and are ephemeral by definition: Deployment/StatefulSet/DaemonSet/ReplicaSet updates or patches; Nodepool downscaling (compaction) or upgrades (cordoned and drained)

Service API Changes: Prefer Blue-green Update to Rolling Update

Summary: To achieve zero-downtime service updates, a Kubernetes rolling update implies that the API must be both forward and backward compatible. Forward compatibility is hard, if it makes sense at all. Blue-green update...

JWT + Third-party Oauth in Single Page App

Imagine you run a single page app at example.com that communicates with backends over RESTful APIs and is authenticated with JWT tokens managed by you, but identities are managed by...

System Design Interview: Scaling Single Server

Imagine your app is doing tremendously well with growing traffic. If there is a single server for your app, and the server is approaching its capacity, how would you scale...

Killer Apps of Message Queues

Message queues are an asynchronous inter-process communication protocol that has gained much of its glory from the recent hype around microservices. Senders and receivers do not interact with the middleware at...

Session Consistency in Replicated Frontend Servers

HTTP provides an abstraction of short connections. Unlike the continuous byte streams in TCP, exchanges between client and server over HTTP start with a client request and end with server...

Pagination Ordered by Secondary Keys on Sharded Stores

A common design for content display, pagination partitions information into multiple pages and serves one at a time. We have seen it in search results, message history, and cascading news...

Kip’s Warehouse: Building Scalable, Reliable, Consistent Web Application from the Ground Up

I have been working with three other wonderful people on the senior design project, which is a web application of an inventory management system, and the production is up at...

Service-Oriented Architecture: Why did Microservices Catch On

All teams will henceforth expose their data and functionality through service interfaces. There will be no other form of inter-process communication (IPC) allowed: no direct linking, no direct reads of...

Category microservices

Kube-proxy and mysterious DNS timeout

This post reviews how iptables-mode kube-proxy works, why some DNS requests to kube-dns were blackholed, and how to mitigate the issue.

Scaling Istio

In a large, busy cluster, how do you scale Istio to address istio-proxy containers being OOM-killed and Istiod crashing when too many istio-proxies are connected?

JWT + Third-party Oauth in Single Page App

Imagine you run a single page app at example.com that communicates with backends over RESTful APIs and is authenticated with JWT tokens managed by you, but identities are managed by...

Istio: Noninvasive Governance of Microservices on Hybrid Cloud

As presented in my previous post, microservices are the state-of-the-art architecture for building scalable, highly available, manageable backends. No more 30-minute build time, single point of failure, and constant regression from...

Killer Apps of Message Queues

Message queues are an asynchronous inter-process communication protocol that has gained much of its glory from the recent hype around microservices. Senders and receivers do not interact with the middleware at...

Session Consistency in Replicated Frontend Servers

HTTP provides an abstraction of short connections. Unlike the continuous byte streams in TCP, exchanges between client and server over HTTP start with a client request and end with server...

Pagination Ordered by Secondary Keys on Sharded Stores

A common design for content display, pagination partitions information into multiple pages and serves one at a time. We have seen it in search results, message history, and cascading news...

Kip’s Warehouse: Building Scalable, Reliable, Consistent Web Application from the Ground Up

I have been working with three other wonderful people on the senior design project, which is a web application of an inventory management system, and the production is up at...

Service-Oriented Architecture: Why did Microservices Catch On

All teams will henceforth expose their data and functionality through service interfaces. There will be no other form of inter-process communication (IPC) allowed: no direct linking, no direct reads of...

Category distributed systems

A Brilliant Hack: Why does Layer 2/3 Checksum use 1’s Complement, Not 2’s

A super quick recap: one’s complement represents negative x by inverting every bit of x, while two’s complement represents negative x as the one’s complement of x plus 1. Symbolically,

System Design Interview: Scaling Single Server

Imagine your app is doing tremendously well with growing traffic. If there is a single server for your app, and the server is approaching its capacity, how would you scale...

A Primer on Secure Communication Channels

In the world of the internet, sending messages in clear text is like swimming naked. We would love some secure communication channels free from eavesdropping or tampering. Security as such is...

Git as Version Vector

Git is one of the most widely used version control systems. Traditionally, a Git repository is considered a complete history of the entire project in the form of...

Sloppy Quorum And Eventual Consistency

Here is where we stand. Fischer-Lynch-Paterson has shown that consensus is not guaranteed in bounded time in a purely asynchronous network. The CAP theorem shows that from consistency, availability, and...

Reliable & Consistent Service: Linearizable RPC and Replicated State Machine

Remote Procedure Call (RPC) is a canonical structuring paradigm for client-server/request-response services.

Category signal processing

Fourier, Phasors, LTI and All That

We all share the sorrow and misery from that signal processing class. You had some crazy formulas thrown at you, kind of knew how to use them, but probably never understood why...

Category networking

Kube-proxy and mysterious DNS timeout

This post reviews how iptables-mode kube-proxy works, why some DNS requests to kube-dns were blackholed, and how to mitigate the issue.

Scaling Istio

In a large, busy cluster, how do you scale Istio to address istio-proxy containers being OOM-killed and Istiod crashing when too many istio-proxies are connected?

Work Around Max Count of Security Group Rules on EKS

AWS EKS on VPC networks needs AWS Security Group (SG) rules to receive ingress traffic. But what if you reach the max rules count in your SG?

Layer-4 Load Balancer & Zero-downtime Autoscaling and Upgrade

Your Kubernetes cluster probably has a shared ingress for north-south traffic, coming from a cloud load balancer and landing on your favorite proxies like Envoy, or Istio gateways, or Nginx....

Kubernetes Networking From the First Principles

We go from containers and network namespace to Pod-to-Pod, Pod-to-Service, and external-client-to-Service networking.

The Good, Bad, and Ugly: Istio for Short-lived Pods

Kubernetes does not differentiate between sidecars and application containers in a Pod. Hence, enabling Istio for short-running workloads poses additional challenges to the conventional approach of injecting an Envoy sidecar to...

DNS, UDP, IP Anycast, and All That

DNS prefers UDP. There are times when DNS must run on TCP (request or response size exceeds a single packet, perhaps due to too many response records), but UDP is...

Lessons from Scaling GKE: L4 ILB Tops at 250 Nodes

My team at Cruise operates tens of Kubernetes clusters with 10,000s of cores and 100s of TB of RAM. Since migration to GCP, we have hit several interesting scaling issues. One...

A Brilliant Hack: Why does Layer 2/3 Checksum use 1’s Complement, Not 2’s

A super quick recap: one’s complement represents negative x by inverting every bit of x, while two’s complement represents negative x as the one’s complement of x plus 1. Symbolically,

JWT + Third-party Oauth in Single Page App

Imagine you run a single page app at example.com that communicates with backends over RESTful APIs and is authenticated with JWT tokens managed by you, but identities are managed by...

A Primer on Secure Communication Channels

In the world of the internet, sending messages in clear text is like swimming naked. We would love some secure communication channels free from eavesdropping or tampering. Security as such is...

Sloppy Quorum And Eventual Consistency

Here is where we stand. Fischer-Lynch-Paterson has shown that consensus is not guaranteed in bounded time in a purely asynchronous network. The CAP theorem shows that from consistency, availability, and...

Category istio

Scaling Istio

In a large, busy cluster, how do you scale Istio to address istio-proxy containers being OOM-killed and Istiod crashing when too many istio-proxies are connected?

The Good, Bad, and Ugly: Istio for Short-lived Pods

Kubernetes does not differentiate between sidecars and application containers in a Pod. Hence, enabling Istio for short-running workloads poses additional challenges to the conventional approach of injecting an Envoy sidecar to...

Istio: Noninvasive Governance of Microservices on Hybrid Cloud

As presented in my previous post, microservices are the state-of-the-art architecture for building scalable, highly available, manageable backends. No more 30-minute build time, single point of failure, and constant regression from...

Dependency Update and Artifacts Promotion in Multi-repo Project

We all know Google employs a version tracking system that uses a single repository/depot. Every closed-source Google product that you love is tracked by this single repo, which is so...

Category security

A Primer on Secure Communication Channels

In the world of the internet, sending messages in clear text is like swimming naked. We would love some secure communication channels free from eavesdropping or tampering. Security as such is...

Category docker

Docker Multi-stage Build: Fast, Minimal and Secure Images

Introduced in Docker v17.05, the multi-stage build feature in Dockerfiles enables you to create smaller container images with better caching and a smaller security footprint. Fundamentally, the new syntax allows one to...

Docker: The Container Metaphor with Profound Revolution

Many regard containers as a virtualization technology. They are missing out. Docker has much more to offer. It is a graceful solution to some of the most painful experiences in...

Category kubernetes

Kube-proxy and mysterious DNS timeout

This post reviews how iptables-mode kube-proxy works, why some DNS requests to kube-dns were blackholed, and how to mitigate the issue.

Scaling Istio

In a large, busy cluster, how do you scale Istio to address istio-proxy containers being OOM-killed and Istiod crashing when too many istio-proxies are connected?

Work Around Max Count of Security Group Rules on EKS

AWS EKS on VPC networks needs AWS Security Group (SG) rules to receive ingress traffic. But what if you reach the max rules count in your SG?

Layer-4 Load Balancer & Zero-downtime Autoscaling and Upgrade

Your Kubernetes cluster probably has a shared ingress for north-south traffic, coming from a cloud load balancer and landing on your favorite proxies like Envoy, or Istio gateways, or Nginx....

Kubernetes Networking From the First Principles

We go from containers and network namespace to Pod-to-Pod, Pod-to-Service, and external-client-to-Service networking.

Lessons from Scaling GKE: L4 ILB Tops at 250 Nodes

My team at Cruise operates tens of Kubernetes clusters with 10,000s of cores and 100s of TB of RAM. Since migration to GCP, we have hit several interesting scaling issues. One...

How to Configure Applications for High Availability in Kubernetes

Pods in Kubernetes are the smallest orchestration unit and are ephemeral by definition: Deployment/StatefulSet/DaemonSet/ReplicaSet updates or patches; Nodepool downscaling (compaction) or upgrades (cordoned and drained)

Service API Changes: Prefer Blue-green Update to Rolling Update

Summary: To achieve zero-downtime service updates, a Kubernetes rolling update implies that the API must be both forward and backward compatible. Forward compatibility is hard, if it makes sense at all. Blue-green update...

CD Tricks for Kubernetes Deployment + ConfigMap

It is common to extract the application configuration to a separate file as a runtime dependency of the container image that includes the application binary. As a result, the same...

Docker Multi-stage Build: Fast, Minimal and Secure Images

Introduced in Docker v17.05, the multi-stage build feature in Dockerfiles enables you to create smaller container images with better caching and a smaller security footprint. Fundamentally, the new syntax allows one to...

Category operation

Navigating Shell for Productivity and Profit

I hope you find inspiration in these pretty neat shell tricks and my shell setup.

How to Configure Applications for High Availability in Kubernetes

Pods in Kubernetes are the smallest orchestration unit and are ephemeral by definition: Deployment/StatefulSet/DaemonSet/ReplicaSet updates or patches; Nodepool downscaling (compaction) or upgrades (cordoned and drained)

Service API Changes: Prefer Blue-green Update to Rolling Update

Summary: To achieve zero-downtime service updates, a Kubernetes rolling update implies that the API must be both forward and backward compatible. Forward compatibility is hard, if it makes sense at all. Blue-green update...

CD Tricks for Kubernetes Deployment + ConfigMap

It is common to extract the application configuration to a separate file as a runtime dependency of the container image that includes the application binary. As a result, the same...

Category career

More Career Advices

Make sure to check out the previous post: Advices I wish I got at the start of my career.

Life and Investment Through the Lens of Uncertainty

Disclaimer: Opinions are my own. Not investment advice.

Software Engineering Levels and Promotion

This post explains the expectations of each engineering level in the most concise and company-agnostic way and reveals the steps towards promotion.

What to Talk about in Effective 1-on-1s

Unlike in school when we get grades on every assignment and in every course, we get less frequent feedback in professional life, usually once or twice per year, which is...

Advices I wish I got at the start of my career

When I was a kid playing chess with my dad, he sometimes would offer me hints on some good moves. I would never make those moves. I would rather make...

Category go

Parameters with Defaults in Go: Functional Options

Unlike C++ or Python, Go does not support default values for function parameters that are left unspecified. Specifically, we want that

Category cloud

Kube-proxy and mysterious DNS timeout

This post reviews how iptables-mode kube-proxy works, why some DNS requests to kube-dns were blackholed, and how to mitigate the issue.

Scaling Istio

In a large, busy cluster, how do you scale Istio to address istio-proxy containers being OOM-killed and Istiod crashing when too many istio-proxies are connected?

Work Around Max Count of Security Group Rules on EKS

AWS EKS on VPC networks needs AWS Security Group (SG) rules to receive ingress traffic. But what if you reach the max rules count in your SG?

Layer-4 Load Balancer & Zero-downtime Autoscaling and Upgrade

Your Kubernetes cluster probably has a shared ingress for north-south traffic, coming from a cloud load balancer and landing on your favorite proxies like Envoy, or Istio gateways, or Nginx....

Lessons from Scaling GKE: L4 ILB Tops at 250 Nodes

My team at Cruise operates tens of Kubernetes clusters with 10,000s of cores and 100s of TB of RAM. Since migration to GCP, we have hit several interesting scaling issues. One...

Category investment

Notes: Venture Deals

Before Fundraise: Allow a minimum of three to six months to raise money. Have a clean cut from last job to avoid IP disputes. Prepare data site (Certificate of Incorporation, Bylaws, board...

Life and Investment Through the Lens of Uncertainty

Disclaimer: Opinions are my own. Not investment advice.

Category startup

Notes: The Lean Startup

Careful planning and execution work for general management but not for startups. Perfect execution is futile if you end up building something nobody wants (waste). The real progress for startups...

Notes: Venture Deals

Before Fundraise: Allow a minimum of three to six months to raise money. Have a clean cut from last job to avoid IP disputes. Prepare data site (Certificate of Incorporation, Bylaws, board...

Enterprise Sales

How to do product-led growth and hands-on outbound sales at the same time?

Interviewing Adrien Treuille, Founder CEO of Streamlit

Streamlit, about to raise its Series-C, was acquired by Snowflake for $800M in March 2022. In this conversation with Adrien, we chatted about OSS metrics, licenses, open-core vs freemium vs...

Accounting Advice for Founders

Notes derived from a guest lecture by Danny Wallace, Partner at PwC’s Silicon Valley practice. For informational purposes only. Errors and omissions are my own.

Intellectual Property and Entrepreneurship

Notes on Intellectual Property (IP) law for founders and busy professionals. Not legal advice. For informational purposes only. Laws can change, so this article may contain dated information. Always consult...

Category oss

Interviewing Adrien Treuille, Founder CEO of Streamlit

Streamlit, about to raise its Series-C, was acquired by Snowflake for $800M in March 2022. In this conversation with Adrien, we chatted about OSS metrics, licenses, open-core vs freemium vs...