Glow — damn simple private PaaS
My biggest issue with current offerings such as Kubernetes is the value-to-complexity ratio. All of these pods, kubelets, groups, tons of error-prone configuration, and dozens of daemons, and yet they don’t solve any of the most basic things I care about:
1. Transparent service discovery
2. Dynamic reconfiguration
3. Log aggregation
4. Basic monitoring
In my view, a bare-bones PaaS that is actually useful goes like this:
1. An application is described by a simple YAML file, stored in a central config repository:
   - name, which is the name of the YAML file
   - provided services, as a list of ports (TCP/UDP)
   - required services, as a list of application:port -> local port entries
   - image name and version in the artifact repository
   - optional list of tags to classify the application
   - optional config file location; the PaaS updates this file atomically in real time
2. An instance of an application gets all of its required services mapped to distinct ports on the loopback interface. A config file “magically” appears at the hardcoded path. Developer box, dev cluster, staging cluster, production cluster: all environments look the same to the app instance. Whether it is packaged as a container or not, started under a debugger or not, it behaves the same.
3. The ports mapped on loopback are served by a smart proxy that balances traffic, handles (partial) outages of the target application, performs service discovery, and retries with a fallback policy.
4. An application instance logs to stdout or stderr, and the output is forwarded to wherever is appropriate.
5. Apps in the cluster are identified by image name + image version + instance number.
6. External systems are exposed to the cluster apps just like any other app, and the proxies work the same way for them.
7. Exposing apps to the world is again done with a list of simple YAML files stored in the central config repository.
8. Since all communication goes through smart proxies, they accumulate metrics on all network events: connects, disconnects, traffic, and request timings (for HTTP and other well-known protocols).
9. Deployment is a simple process. Each app instance has a YAML file in the central config repository containing:
   - application name
   - cluster node to run on
   - instance id, a positive number (optional)

   Stopping an app is simply a matter of deleting its file (or moving it elsewhere).
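To make the two kinds of YAML a bit more concrete, here is a minimal Python sketch of how they might look once parsed. Everything in it is an assumption for illustration: the field names, the `application:port -> local_port` string syntax, the `image:version#instance` identifier format, and the `reconcile` helper are hypothetical, and YAML parsing itself is left out to keep the sketch dependency-free.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Requirement:
    """One required-service entry: application:port -> local port."""
    application: str
    port: int
    local_port: int


@dataclass
class AppDescriptor:
    """In-memory form of one application YAML (field names are assumed)."""
    name: str                       # name of the YAML file
    image: str                      # image name in the artifact repository
    version: str                    # image version
    provides: List[str] = field(default_factory=list)   # e.g. "8080/tcp"
    requires: List[Requirement] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    config_path: Optional[str] = None


@dataclass(frozen=True)
class Deployment:
    """In-memory form of one per-instance deployment YAML."""
    application: str
    node: str
    instance: int = 1               # optional positive number


def parse_requirement(entry: str) -> Requirement:
    """Parse 'application:port -> local_port' (this exact syntax is assumed)."""
    remote, local = (part.strip() for part in entry.split("->"))
    application, port = remote.rsplit(":", 1)
    return Requirement(application, int(port), int(local))


def instance_id(app: AppDescriptor, dep: Deployment) -> str:
    """Image name + image version + instance number (format is assumed)."""
    return f"{app.image}:{app.version}#{dep.instance}"


def reconcile(desired: Set[Deployment], running: Set[Deployment]
              ) -> Tuple[Set[Deployment], Set[Deployment]]:
    """Return (to_start, to_stop); a deleted deployment file lands in to_stop."""
    return desired - running, running - desired
```

With this shape, "stopping an app is deleting a file" falls out of a set difference: a node agent that re-reads the deployment files only has to compare what the files say against what is actually running.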
This takes just enough opinionated convention, and leaves just enough flexibility, to get the following:
1. High developer productivity: it works on your box the same as in the cloud.
2. Freedom from most configuration headaches, namely network dependencies. You can still keep business config in a single coherent place.
3. Reliability and network connectivity issues, log aggregation, and monitoring are handled by the infrastructure.
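The reliability part rests on the smart proxy described earlier. Below is a minimal sketch of its core decision loop only, with no real sockets: rotate through the known instances of the target application, fall back to the next one on a connection error, and count what happened. The `LoopbackRoute` class, the `send` callback, and the metric names are all hypothetical, and blindly retrying like this is only safe for requests that are idempotent.

```python
import time
from collections import Counter, defaultdict
from typing import Callable, DefaultDict, List, Sequence, TypeVar

T = TypeVar("T")


class AllBackendsFailed(Exception):
    """Raised when every known instance of the target application failed."""


class LoopbackRoute:
    """One loopback port mapping, backed by several instances of an app."""

    def __init__(self, backends: Sequence[str]) -> None:
        self.backends: List[str] = list(backends)
        self.counters: Counter = Counter()  # per-backend ok/failed counts
        self.timings: DefaultDict[str, List[float]] = defaultdict(list)
        self._cursor = 0                    # round-robin starting point

    def call(self, send: Callable[[str], T]) -> T:
        """Run send(backend), trying each backend at most once per call."""
        count = len(self.backends)
        start = self._cursor
        if count:
            self._cursor = (start + 1) % count
        for offset in range(count):
            backend = self.backends[(start + offset) % count]
            began = time.monotonic()
            try:
                result = send(backend)
            except OSError:
                # Connection-level failure: record it, fall back to the next.
                self.counters[backend + " failed"] += 1
                continue
            self.counters[backend + " ok"] += 1
            self.timings[backend].append(time.monotonic() - began)
            return result
        raise AllBackendsFailed(f"tried {count} backend(s)")
```

A real proxy would also need health checks, timeouts, and a feed of backend addresses from service discovery; the point here is only that balancing, fallback, and metrics can share one small loop.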
What is not covered:
1. Automatic distribution of app instances across the cluster. I’ve never seen it work without extensive tweaking, so it is best left to a script tailored to your infrastructure.
2. Scaling, because.
Now the interesting part: is it too hard to implement? Maybe not, once you drop the intricate cluster managers and the layers upon layers of abstraction, each with its own unique configuration dance.
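As a taste of how small some of the pieces can be, here is a sketch of the atomic config file update mentioned in the application description. It uses the usual write-to-a-temp-file-then-rename technique; the function name and the temp file prefix are made up for illustration, and the atomicity relies on the temp file living on the same filesystem as the target.

```python
import os
import tempfile


def write_config_atomically(path: str, content: str) -> None:
    """Replace the file at path so readers see the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one
    # filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".config-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # Best-effort cleanup so failed updates do not leave temp files behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

An app that re-reads its config file on change never observes a half-written file this way, which is what makes the "magically appearing" config safe to consume.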