NodeUp Distribution Deep Dive

Reproducibility: ensuring build results are consistent over time and across different machines.
Determinism: ensuring the installed packages/modules are exactly the same given an identical package.json and lockfile. This matters in JavaScript development because dependencies are deeply nested.

File size and build speed are relevant too.

npm lockfile (package-lock.json) and yarn lockfile (yarn.lock). These should be committed to version control to enable reproducibility across machines.
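One way to check that two machines resolve the same dependency tree is to compare a hash of the committed lockfile. A minimal sketch, assuming the lockfile text has already been read into a string; the function names are hypothetical and are not part of npm or yarn:

```python
import hashlib


def lockfile_fingerprint(lockfile_text: str) -> str:
    """Return the SHA-256 hex digest of the lockfile contents."""
    return hashlib.sha256(lockfile_text.encode("utf-8")).hexdigest()


def same_lockfile(lockfile_a: str, lockfile_b: str) -> bool:
    """True when both machines are installing from byte-identical lockfiles."""
    return lockfile_fingerprint(lockfile_a) == lockfile_fingerprint(lockfile_b)
```

A CI job could print the fingerprint of yarn.lock or package-lock.json so that a mismatch between machines is easy to spot in the logs.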

Yarn is deterministic provided the Yarn version is identical.

Yarn's --mutex flag ensures only one instance runs at any given time. This avoids conflicts when multiple instances of yarn run as the same user on the same machine (e.g. in a Continuous Integration scenario).
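A minimal sketch of how a CI script might assemble that command, assuming Yarn 1 (classic), where --mutex accepts either `network` or `file:<path>`. The helper name and the mutex path are hypothetical:

```python
from typing import List, Optional


def yarn_install_command(mutex_file: Optional[str] = None) -> List[str]:
    """Build a `yarn install` command that allows only one running instance.

    With no argument the network mutex is used; otherwise a file mutex
    at the given path (the path is chosen by the caller).
    """
    mutex = f"file:{mutex_file}" if mutex_file else "network"
    return ["yarn", "install", "--mutex", mutex]


# Show the command a CI job would run; pass the list to
# subprocess.run(cmd, check=True) to execute it.
print(" ".join(yarn_install_command("/tmp/.yarn-mutex")))
```

Returning the command as a list rather than a string keeps it safe to hand to subprocess without shell quoting.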

Infrastructure configuration and orchestration

Infrastructure-as-code (IAC) is an approach to provisioning and managing infrastructure, such as cloud servers and the software on them, through code rather than manual steps.

IAC tools like Chef, Puppet, Ansible, and SaltStack are all configuration management tools: they are designed to install and manage software on existing servers.
IAC tools like CloudFormation by Amazon and Terraform by HashiCorp are orchestration tools: they are designed to provision the servers themselves, leaving the job of configuring those servers to other tools.

Some tools provide aspects of both.

Chef and Ansible encourage a procedural style where you write code that specifies, step-by-step, how to achieve some desired end state. Terraform, CloudFormation, SaltStack, and Puppet all encourage a more declarative style where you write code that specifies your desired end state, and the IAC tool itself is responsible for figuring out how to achieve that state.
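The contrast between the two styles can be sketched with a toy model. This is not how any of these tools is implemented; the package names and function names are hypothetical and chosen only for illustration:

```python
from typing import List, Set


def procedural_setup(installed: Set[str]) -> None:
    """Procedural style: the author spells out each step, in order."""
    installed.add("nginx")
    installed.add("nodejs")
    installed.discard("apache2")


def plan(current: Set[str], desired: Set[str]) -> List[str]:
    """Declarative style: the author states the end state and the tool
    works out which steps are needed to get there from the current state."""
    steps = [f"install {pkg}" for pkg in sorted(desired - current)]
    steps += [f"remove {pkg}" for pkg in sorted(current - desired)]
    return steps
```

In the declarative version, running the plan against a server that already matches the desired state produces no steps, which is why declarative definitions can be re-applied safely.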

Configuration management tools such as Chef, Puppet, Ansible, and SaltStack typically default to a mutable infrastructure paradigm, where changes are applied in place on existing servers. With an immutable infrastructure approach, for example Terraform deploying images built by Docker or Packer, every change is actually a deployment of a new server. This approach reduces the possibility of configuration drift bugs (where servers have slightly different configurations because they have been changed in-place).
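Configuration drift can be illustrated with a small simulation. This is a sketch under simplifying assumptions: each server is modelled as a dictionary of package versions, and all names and version numbers are hypothetical:

```python
from typing import Dict, List

Config = Dict[str, str]  # package name -> version


def patch_in_place(server: Config, package: str, version: str) -> None:
    """Mutable approach: change one existing server directly."""
    server[package] = version


def redeploy(image: Config, count: int) -> List[Config]:
    """Immutable approach: replace the fleet with fresh copies of one image."""
    return [dict(image) for _ in range(count)]


def has_drift(servers: List[Config]) -> bool:
    """True when any server's configuration differs from the first one."""
    return any(server != servers[0] for server in servers[1:])
```

Patching a single server makes has_drift return True for the fleet, while redeploying every server from an updated image brings the fleet back to one configuration.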

Server configuration management tools

Configuration management tools are designed to install and manage software on existing servers (cf. orchestration tools, which provision the servers and leave configuring them to other tools).

Configuration tools include Puppet, Chef, Ansible and SaltStack.


Puppet performs administrative tasks (such as adding users, installing packages, and updating server configurations) based on a centralized specification.

Puppet uses its own declarative DSL for describing system configuration; it is not a general-purpose programming language.

Puppet uses a master server / client agent model and is written in Ruby.

How to install Puppet on Digital Ocean


Chef is a configuration management tool designed to bring automation to your entire infrastructure.

Chef uses a master server / client agent model like Puppet and is also written in Ruby.

Getting started with Chef on Digital Ocean


Unlike Puppet and Chef, Ansible performs all functions over SSH and is written in Python. It is agentless: a control machine pushes changes to the managed machines, so no client agent has to be installed on them.

Ansible can use the Digital Ocean API to create and manage droplets.

How to install Ansible on Digital Ocean
How to create Digital Ocean droplets using Ansible


SaltStack is software to automate the management and configuration of any infrastructure or application at scale.

Intro to SaltStack
How to install SaltStack on Digital Ocean

Server orchestration tools

Orchestration tools provision servers (and leave the job of configuring the servers to other tools).


Terraform is a tool for building and managing infrastructure in an organized way.

Infrastructure-as-code (IAC) can greatly reduce the overhead of creating servers in the cloud. Terraform is a leader in IAC (along with CloudFormation by Amazon).

Terraform with Digital Ocean



Docker provides reproducible deploys: build once, deploy many times from that image.

How to install and use Docker
How to containerise and use Nginx
How to create a cluster of Docker containers on CentOS.

Developing and testing microservices with Docker
Developing microservices - Node, React and Docker

Docker and databases

docker-compose is broken.

Docker Swarm and Docker Compose are not mature. Use Docker strictly for building your container images, then Kubernetes takes care of everything else.

Bazel can be used for deterministic Docker builds.


Kubernetes is powerful container management software. Essential features like service discovery, automatic load-balancing, container replication and more are built in. Plus, it’s all powered via an HTTP API.
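Because everything goes through the HTTP API, listing pods is an ordinary GET request. A minimal sketch that builds the request without sending it, assuming the API is reachable through `kubectl proxy` on its default local address; the helper name is hypothetical, and authentication is left out:

```python
from urllib.parse import quote
from urllib.request import Request


def list_pods_request(api_server: str, namespace: str = "default") -> Request:
    """Build a GET request for the pods in one namespace.

    Uses the core v1 path /api/v1/namespaces/<namespace>/pods.
    """
    base = api_server.rstrip("/")
    url = f"{base}/api/v1/namespaces/{quote(namespace, safe='')}/pods"
    return Request(url, headers={"Accept": "application/json"}, method="GET")


# Assumed address of a local `kubectl proxy`; pass the request to
# urllib.request.urlopen to send it.
request = list_pods_request("http://127.0.0.1:8001")
print(request.full_url)
```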

If you already have Docker containers that you'd like to launch and load balance, Kubernetes is the best way to run them.

What is Kubernetes
Introduction to Kubernetes

" If your cloud footprint isn't costing you let's say >= $50k/mo or so, I might argue that the engineering effort to Kubernetize your things is probably better spent in more direct ways on your product."





Consul allows services to register with each other via DNS or HTTP.
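A minimal sketch of both interfaces, assuming a local Consul agent on its default HTTP port 8500 and the default `consul` DNS domain. The request is only built, not sent, and the helper names and the example service are hypothetical:

```python
import json
from urllib.request import Request


def register_service_request(name: str, port: int,
                             agent: str = "http://127.0.0.1:8500") -> Request:
    """Build a PUT request that registers a service with the local agent."""
    payload = {"Name": name, "Port": port}
    return Request(
        f"{agent}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def service_dns_name(name: str) -> str:
    """Return the DNS name other services can use to look this one up."""
    return f"{name}.service.consul"
```

Once a service such as `web` is registered over HTTP, other services can resolve `web.service.consul` through Consul's DNS interface instead of hard-coding an address.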

An introduction to Consul: a service discovery system

Greenfields setups

Greenfields infrastructure scenario:

* set up tests
* set up Docker images
* set up docker-compose for easy local deployments
* do lots of work to make the app behave inside Docker
* take control of AWS and IAM using CloudFormation, making it easy to set up and roll back permissions
* bootstrap KOPS/K8s with CloudFormation
* set up Jenkins (or Travis) CI on top of K8s
* K8s deployments of the app for CI
* staging K8s deployments
* K8s in production

Greenfields infrastructure scenario

I've started using K8s and so far so good. Minikube makes local dev easy. However, I use K8s with Ansible. So now I can:
* Build a fully baked container for local use
* Build a container that mounts my local filesystem for the code directory so I can have a sane dev-test cycle with hot-code reloading
* Mount a directory when running locally containing credentials for accessing Google Cloud services
* Spin up a K8s cluster in arbitrary GKE accounts, template and deploy my deployment and service accounts, pin some with external IPs, etc
* Push containers to GCR
* Deploy my containers wherever - shared directory ones can only run locally.
This is for an architecture with about 4 microservices and that will probably grow.
Now I'm working on an Ansible playbook to zip the code directory of one of my microservices, upload it to GCS, then run a build container on my cluster to build the Docker image for my microservice and push to GCR from there, so I don't have to waste time pushing large containers up to GCR. Once this is done I'll look at promoting containers through dev/test/prod environments, since all config will be done with env vars.
I've never read about using Ansible with K8s, but to me it's a no-brainer. Most people seem to cobble together bash scripts, but using the best of both has really led to a good experience.
The benefit we hope to get is isolation and cost efficiency.
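Keeping all config in env vars is what lets the same container image be promoted through dev/test/prod unchanged. A minimal sketch of how an app inside the container might read its settings; the variable names (APP_ENV, DATABASE_URL, DEBUG) and the Settings class are hypothetical, not taken from the setup described above:

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    environment: str
    database_url: str
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from environment variables.

    DATABASE_URL is required (a KeyError is raised if it is missing);
    the other variables fall back to defaults.
    """
    return Settings(
        environment=env.get("APP_ENV", "dev"),
        database_url=env["DATABASE_URL"],
        debug=env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
    )
```

Each environment then differs only in the variables set on the deployment, not in the image that runs.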