Wednesday, September 13, 2017

Apps Script Navigator: UrlFetchApp Empowered with Cookie Management, Re-login and Much More!

If you are a fan of Google Apps Script, you would definitely have played around with its UrlFetchApp module, which offers a convenient HTTP(S) client interface. As a frequent user of the module, although I admit that it is really cool, I always felt I needed a richer client: one capable of handling stuff like automatic cookie management, proper redirection, and logout detection with automatic re-login.
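
For instance, with raw UrlFetchApp, a session that needs to survive a login would look something like the following (the URLs and field names are placeholders, and the cookie handling shown is deliberately naive):

// plain UrlFetchApp: the session cookie has to be captured and re-sent manually
var loginResponse = UrlFetchApp.fetch("https://www.example.com/login", {
	method: "post",
	payload: {email: "user@example.com", password: "secret"},
	followRedirects: false
});

// 'Set-Cookie' may be a string or an array, depending on how many cookies were set
var cookie = loginResponse.getAllHeaders()["Set-Cookie"];

// ...and it has to be attached by hand to every subsequent request
var home = UrlFetchApp.fetch("https://www.example.com/home", {
	headers: {"Cookie": cookie}
});
Logger.log(home.getContentText());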

Hence came into being Navigator, a wrapper for UrlFetchApp that incorporates some of the basic HTTP client operations into convenient wrapper methods while addressing some of the pain points of the original UrlFetchApp module.

Unfortunately, due to a bug in the Apps Script framework, Navigator is not yet fully compatible with the Apps Script editor's autocompletion feature. Hence, for now, we will have to depend on the comments in the source as documentation. Here is a summary of the features and utilities of the module:

A Navigator can be constructed using:

/**
 * a Navigator
 * invocation: new Navigator.Navigator(baseUrl)
 * returns a new Navigator object based on the given base URL (protocol://host:port/resource/path).
 * @class 
 * @param {baseUrl} default base URL for outbound requests
 * @implements {Navigator}
 * @return {Navigator} a new Navigator
 */
function Navigator(baseUrl)

The Navigator object currently supports the following methods:

/**
 * executes a GET request
 * @param {path} the destination path (relative or absolute)
 * @return the response payload
 */
Navigator.prototype.doGet

/**
 * executes a POST request
 * @param {path} the destination path (relative or absolute)
 * @param {payload} the payload (will be {@link UrlFetchApp}-escaped unless a String) to be sent with the
 *                  request; i.e. sent verbatim in case it is a string, or with escaping otherwise
 * @param {headers} a map of key-value headers to be sent with the request
 * @return the response payload
 */
Navigator.prototype.doPost

/**
 * executes an arbitrary request in {@link UrlFetchApp} style, for cases where you want to directly
 * manipulate certain options being passed to UrlFetchApp.fetch. However this still provides the
 * built-in enhancements of Navigator such as automatic cookie management.
 * @param {path} the destination path (relative or absolute)
 * @param {options} a {@link UrlFetchApp}-compatible options object
 * @return the response payload
 */
Navigator.prototype.sendRequest
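
For example, a hypothetical sendRequest call (the option names follow standard UrlFetchApp.fetch conventions) could look like:

// assuming 'nav' is a Navigator created for https://www.example.com
var body = nav.sendRequest("report", {
	method: "put",
	contentType: "application/json",
	payload: JSON.stringify({from: "2017-09-01", to: "2017-09-13"}),
	muteHttpExceptions: true
});
Logger.log(body);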

The following configurator methods decide the behaviour of various features of Navigator:

/**
 * if set, cookies will be saved in {@link PropertiesService.getScriptProperties()}
 * @param {saveCookies} true if cookies should be saved
 */
Navigator.prototype.setSaveCookies

/**
 * if saveCookies is set, decides the base username for saving cookies in the properties store (key {cookieusername}_cookie_{cookiename})
 * @param {cookieUsername} base username for cookies
 */
Navigator.prototype.setCookieUsername

/**
 * updates the local cookie cache with cookies received in a response, and returns the computed 'Cookie' header
 * @param {cookie} the current 'Cookie' header (string)
 * @param {rawCook} the cookie string ('Set-Cookie' header) received in a response
 * @return the updated 'Cookie' header string
 */
Navigator.prototype.updateCookies

/**
 * sets an absolute (starting with protocol://) or relative path for login requests to base website
 * @param {loginPath} path for login requests
 */
Navigator.prototype.setLoginPath

/**
 * sets the payload to be submitted during login (for automatic relogin)
 * @param {loginPayload} the login request payload
 */
Navigator.prototype.setLoginPayload

/**
 * if set, an automatic relogin will be performed whenever this content fragment is encountered in the response body
 * @param {logoutIndicator} content indicating a logout, for attempting relogin
 */
Navigator.prototype.setLogoutIndicator

/**
 * if set, when an automatic login is executed during a URL request, the original request will be replayed after login
 * @param {refetchOnLogin} true if refetch is required in case of a relogin
 */
Navigator.prototype.setRefetchOnLogin

/**
 * if set, logs would be generated for each request
 * @param {debug} true if request debug logging should be enabled
 */
Navigator.prototype.setDebug

The internal state of Navigator (such as the currently active cookies) can be obtained via the following methods:

/**
 * returns current 'Cookie' header
 * @return current 'Cookie' header string
 */
Navigator.prototype.getCookies

/**
 * returns headers received in the last navigation
 * @return headers from the last navigation
 */
Navigator.prototype.getLastHeaders

Navigator also provides some handy utility functions for extracting content from navigated pages, including those in the vicinity of HTML tags:

/**
 * similar to {@link extract} but is specialized for extracting form field values ("value" attributes)
 * @param {body} the HTML payload string
 * @param {locator} locator of the form field to be extracted (appearing before value)
 * @return value of the form field
 */
function getFormParam(body, locator)

/**
 * extracts a given tag attribute from an HTML payload based on a given locator; assumes the locator appears before the attribute
 * @param {body} the HTML payload string
 * @param {key} key of the tag attribute
 * @param {locator} locator of the tag whose attribute is to be extracted (appearing before key)
 * @return value of the tag attribute
 */
function extract(body, key, locator)

/**
 * similar to {@link extract} but performs a reverse match (for cases where the locator appears after the attribute)
 * @param {body} the HTML payload string
 * @param {key} key of the tag attribute
 * @param {locator} locator of the tag whose attribute is to be extracted (appearing after key)
 * @return value of the tag attribute
 */
function extractReverse(body, key, locator)
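
As a rough illustration of how the locators work (the HTML fragment below is made up, and the exact matching behaviour may differ slightly):

var html = '<input type="hidden" name="authenticity_token" value="abc123">' +
	'<meta content="xyz789" itemprop="csrf-token">';

// the locator appears before the attribute being extracted
Logger.log(Navigator.getFormParam(html, 'name="authenticity_token"'));          // abc123
Logger.log(Navigator.extract(html, "value", 'name="authenticity_token"'));      // abc123

// here the locator (itemprop) appears after the attribute (content), hence extractReverse
Logger.log(Navigator.extractReverse(html, "content", 'itemprop="csrf-token"')); // xyz789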

Here are a few snippets utilizing the above features. I have masked the actual URLs and some of the parameters being passed, but they should give you the hang of Navigator's usage.

Setting up automatic login and cookie saving:

var nav = new Navigator.Navigator("http://www.example.com");
nav.setSaveCookies(true);
nav.setCookieUsername(email);

// login form at http://www.example.com/login
nav.setLoginPath("login");

// will output all URLs, headers, cookies and payloads via Logger
nav.setDebug(true);

// for static login forms, this payload will be submitted during automatic log-in (if enabled)
nav.setLoginPayload({
	email: email,
	password: password
});

// only logged-out pages contain this; if we see this, we know we're logged out
nav.setLogoutIndicator("password_reset");

// if you request /home, and Navigator happens to find itself logged out and logs in again,
// setting this to true will make Navigator re-fetch /home right after the re-login
nav.setRefetchOnLogin(true);

// try #1; will automatically log in if stored cookies are already expired
var str = nav.doGet("daily");
if (str.indexOf("Get daily prize now") > 0) {
	str = nav.doGet("daily"); // try #2
	if (str.indexOf("Get daily prize now") > 0) {
		// notify failure
	}
}

Extracting HTML and form parameters for composing complex payloads:

// n is a Navigator

/* this will encode the payload as a regular form submission, just like UrlFetchApp does;
   the Content-Type header has no significance here. If you want to send JSON you have to
   JSON.stringify() the payload yourself and pass the resulting string as the payload */

r = n.doPost("submit_path", {
	name: "name",
	email: email,
	"X-CSRF-Token": Navigator.extractReverse(s, "content", 'itemprop="csrf-token"')
}, {
	"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
	"X-Requested-With": "XMLHttpRequest"
});

/* sometimes, due to issues like empty param values (https://issuetracker.google.com/issues/36762225)
   you will have to do the encoding on your own, in which case you can directly submit a string payload
   which will be passed to UrlFetchApp verbatim (with escaping disabled) */

var payload = "authenticity_token=" +
	encodeURIComponent(Navigator.getFormParam(s, 'name="authenticity_token"')) +
	"&timestamp=" + Navigator.getFormParam(s, 'name="timestamp"') +
	"&spinner=" + Navigator.getFormParam(s, "spinner") + "&" +
	Navigator.extract(s, "name", "js-form-login") + "=USERNAME&login=&" +
	Navigator.extract(s, "name", "js-form-password") + "=PASSWORD&password=";

s = n.doPost("user_sessions", payload, {
	"Origin": "https://www.example.com",
	"X-CSRF-Token": Navigator.extractReverse(s, "content", 'itemprop="csrf-token"'),
	"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
	"X-Requested-With": "XMLHttpRequest"
});

Obtaining headers and cookies retrieved in the last response:

Logger.log(n.getLastHeaders());
Logger.log(n.getCookies());

You can use Navigator in your own Apps Script project by adding it as an external library (Project key: MjQYH5uJpvHxeouBGoyDzheZkF1ZLDPsS). The source is also available via this Apps Script project, in case you are interested in peeking at the innards, or want to customize it with your own gimmicks. Should you decide that you need more, which may also be useful for others (including me), please do share your idea here (along with the implementation, if you already have one) so that I can incorporate it into Navigator.

Happy scripting!

Monday, September 11, 2017

Logging in Style: log4j 2, Contextuality, Auto-cleanup... All with No Strings Attached!

Logging—maintaining a temporal trace of operations—is vital for any mission-critical system, no matter how big or small. Same was the case with our Project-X framework, which is why we wanted to get it right, right from the beginning.

Contextual logging—where each log line automatically records its originating logical context, such as whether it came from a specific unit or from the base framework—was something we had been looking forward to, based on our experience with logging on the legendary UltraESB.

We already knew that log4j2 offers contextual logging with its CloseableThreadContext implementation, covering almost everything that we needed; but we needed more:

  1. We needed a proper log code governance mechanism where each log line contains a unique log code, indicating the subsystem, module (package) and even an exact "index" of the specific log statement, so that we would no longer need to grep through the whole codebase to find out where the bugger came from.
  2. We needed environment variables and system properties with a certain prefix to be automatically injected into the logging context, so that specific applications could include their runtime parameters (such as the cluster ID, in the case of our Integration Platform) in the logs.

We also needed to be API-independent of log4j2, as we should retain the freedom to detach from log4j2 and utilize a different logging framework (such as logback) in case we need to. While we could have utilized a third-party wrapper such as SLF4J, we couldn't find one that could readily fulfill all our demands.

Hence, as with the previous UltraESB, we wrapped log4j2 with x-logging, our own logging implementation. x-logging consists of an API and a set of bindings to real logging frameworks (like log4j2 and logback), one of which can be plugged in dynamically at server startup time with Java's dear old ServiceLoader mechanism. This helped us avoid leaking log4j2-specific stuff into our implementations, as the log4j2-based implementation (and hence log4j2 itself) could be completely removed from the set of compile-time dependencies.

Ruwan from our team, who was also the originator of Project-X, hacked around with log4j2 for some time, and finally came up with a cool design to automatically propagate the current context of a log line, i.e. whether it originated from the platform (system, a.k.a. engine) or from a deployed project, and if it's the latter, other metadata of the project (such as the version). The coolest part was that this context automatically gets cleaned up once execution leaves that particular context.

If you are familiar with CloseableThreadContext, this may all sound quite simple. For the rest of the crowd, it would be enough to mention that CloseableThreadContext facilitates injecting key-value pairs to the logging context, such that when the context is closed, only the ones injected in the current context get cleaned up. The injected values are automatically made available to the logging context (ThreadContext) of the calling thread; or, in English, every log line printed by that thread sees the parameter in its thread context (or MDC in old-school jargon).

Okay, I admit the above might have been a bit hard to understand. Perhaps a sample snippet may do a better job:

// assume we are walking in, with nothing useful inside the context

try (CloseableThreadContext.Instance level1 = CloseableThreadContext.put("level", "1")) {

    // now the context has "1" as "level"
    logger.debug("Commencing operation"); // will see {level=1} as the context
    // let's also put in a "clearance" value
    level1.put("clearance", "nypd");
    // now, any log lines would see {level=1,clearance=nypd}

    // let's go deeper
    try (CloseableThreadContext.Instance level2 = CloseableThreadContext.put("level", "2").put("clearance", "fbi")) {

        // now both of the above "level" and "clearance" values are "masked" by the new ones
        // and yes, you can chain together the context mutations
        logger.debug("Commencing investigation"); // will see {level=2,clearance=fbi}

        // putting in some more
        level2.put("access", "privileged");
        // now context is {level=2,clearance=fbi,access=privileged}

        // still deeper...
        try (CloseableThreadContext.Instance level3 = CloseableThreadContext.put("level", "3").put("clearance", "cia")) {

            // "level" and "clearance" are overridden, but "access" remains unchanged
            logger.debug("Commencing consipracy"); // {level=3,clearance=cia,access=privileged}

        }

        // cool thing is, once you're out of the level3 block, the context will be restored to that of level2
        // (thanks to the AutoCloseable nature of CloseableThreadContext.Instance)

        logger.debug("Back to investigation"); // {level=2,clearance=fbi,access=privileged}
    }

    // same for exiting level 2
    logger.debug("Back to operation"); // {level=1,clearance=nypd}; access is gone!

}

logger.debug("Back to square one"); // {}; oh no, all gone!

This mechanism was ideal for our use, as we needed to include the current execution context of a thread along with every log line generated by that thread:

  1. In Project-X, the underlying engine of UltraESB-X, a worker threadpool maintained at the base framework level is responsible for processing inbound messages on behalf of an integration flow belonging to a particular project.
  2. We consider the thread to be in the project's context only after the message has been injected to the ingress connector of a particular integration flow. The worker thread is supposed to do quite a bit of work before that, all of which would be considered to belong to a system context.
  3. We generate logs throughout the whole process, so they should automatically get tagged with the appropriate context.
  4. Moreover, since we have specific log codes for each log line, we need to open up a new context each time we actually output a log line, which would contain the required log code in addition to the other context parameters.

So, the life of a thread in the pool would be an endless loop of something like:

// wake up from thread pool

// do system level stuff

loggerA.debug(143, "Now I'm doing this cool thing : {}", param);

try (CloseableThreadContext.Instance projectCtx = CloseableThreadContext.put("project", project.getName())
     .put("version", project.getVersion())) {

    // do project level stuff

    loggerM.debug(78, "About to get busy : {}", param);

    // more stuff, tra la la la
}

// back to system level, do still more stuff

// jump back to thread pool and have some sleep

Internally, loggerA, loggerM and others will ultimately invoke a logImpl(code, message, params) method:

// context already has system/project info,
// logger already has a pre-computed codePrefix

try (CloseableThreadContext.Instance logCtx = CloseableThreadContext.put("logcode", codePrefix + code)) {
    // publish the actual log line
}

// only "logcode" cleared from the context, others remain intact

We simulated this behaviour without binding to log4j2, by introducing a CloseableContext interface whose log4j2 variant (Log4j2CloseableContext, obviously) will manipulate CloseableThreadContext instances in the same manner:

import java.io.Closeable;

public interface CloseableContext extends Closeable {

    CloseableContext append(final String key, final String value);

    void close();
}

and:

import org.adroitlogic.x.logging.CloseableContext;
import org.apache.logging.log4j.CloseableThreadContext;

public class Log4j2CloseableContext implements CloseableContext {

    private final CloseableThreadContext.Instance ctx;

    /**
     * Creates an instance wrapping a new default MDC instance
     */
    Log4j2CloseableContext() {
        this.ctx = CloseableThreadContext.put("impl", "project-x");
    }

    /**
     * Adds the provided key-value pair to the currently active log4j logging (thread) context
     *
     * @param key   the key to be inserted into the context
     * @param value the value to be inserted, corresponding to {@code key}
     * @return the current instance, wrapping the same logging context
     */
    @Override
    public CloseableContext append(String key, String value) {
        ctx.put(key, value);
        return this;
    }

    /**
     * Closes the log4j logging context wrapped by the current instance
     */
    @Override
    public void close() {
        ctx.close();
    }
}

Now, all we have to do is to open up an appropriate context via a nice management interface, LogContextProvider:

// system context is active by default

...

try (CloseableContext projectCtx = LogContextProvider.forProject(project.getName(), project.getVersion())) {

    // now in project context
}

// back to system context

And in logImpl:

try (CloseableContext logCtx = LogContextProvider.overlayContext("logcode", codePrefix + code)) {
    // call the underlying logging framework
}

Since we load the CloseableContext implementation together with the logger binding (via ServiceLoader), we know that LogContextProvider will ultimately end up invoking the correct implementation.
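
As a rough sketch of that lookup (the class and interface names below are illustrative; the actual LogContextProvider internals are not shown in this post):

import java.util.ServiceLoader;

import org.adroitlogic.x.logging.CloseableContext;

// hypothetical SPI implemented by each binding (e.g. the log4j2 one) and registered under META-INF/services
interface CloseableContextFactory {
    CloseableContext newContext();
}

final class LogContextProviderSketch {

    // resolved once, at class-load time
    private static final CloseableContextFactory FACTORY = load();

    private static CloseableContextFactory load() {
        // pick whichever binding is present on the runtime classpath; fail fast if none is found
        for (CloseableContextFactory factory : ServiceLoader.load(CloseableContextFactory.class)) {
            return factory;
        }
        throw new IllegalStateException("no x-logging binding found on the classpath");
    }

    static CloseableContext forProject(String name, String version) {
        return FACTORY.newContext().append("project", name).append("version", version);
    }
}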

And that's the story of contextual logging in our x-logging framework.

Maybe we could also explain our log code governance approach in a future post; until then, happy logging!

Sunday, September 10, 2017

How to Get Kubernetes Running - On Your Own Ubuntu Machine!

Note: This article largely borrows from my previous writeup on installing K8s 1.7 on CentOS.

Getting a local K8s cluster up and running is one of the first "baby steps" of toddling into the K8s ecosystem. While it would often be considered easier (and safer) to get K8s set up in a cluster of virtual machines (VMs), it does not really give you the same degree of advantage and flexibility as running K8s "natively" on your own host.

That is why, when we decided to upgrade our Integration Platform for K8s 1.7 compatibility, I decided to go with a native K8s installation for my dev environment, a laptop running Ubuntu 16.04 on 16GB RAM and a ?? CPU. While I could run three—at most four—reasonably powerful VMs in there as my dev K8s cluster, I could do much better in terms of resource saving (the cluster would be just a few low-footprint services, rather than resource-eating VMs) as well as ease of operation and management (start the services, and the cluster is ready in seconds). Whenever I wanted to try multi-node stuff like node groups or zones with fail-over, I could simply hook up one or two secondary (worker) VMs to get things going, and shut them down when I was done.

I had already been running K8s 1.2 on my machine, but the upgrade would not have been easy as it's a hyperjump from 1.2 to 1.7. Luckily, the K8s guys had written their installation scripts in an amazingly structured and intuitive way, so I could get everything running with around one day's struggle (most of which went into understanding the command flow and modifying it to suit Ubuntu, and my particular requirements).

I could easily have utilized the official kubeadm guide or the Juju-based installer for Ubuntu, but I wanted to get at least a basic idea of how things get glued together; additionally, I also wanted an easily reproducible installation from scratch, with minimal dependencies on external packages or installers, so I could upgrade my cluster any time I desired, by building the latest stable (or beta, or even alpha, if it comes to that) directly off the K8s source.

I started with the CentOS cluster installer, and gradually modified it to suit Ubuntu 16.04 (luckily the changes were minimal). Most of the things were in line with the installation for CentOS, including the artifact build (make at the source root) and modifications to exclude custom downloads for etcd, flanneld and docker (I built the former two from their sources, and the latter I had already installed via the apt package manager). However, the service configuration scripts had to be modified slightly to suit the Ubuntu (more precisely, Debian) filesystem layout.

In my case, I run the apiserver on port 8090 rather than 8080 (to leave room for other application servers that are fond of 8080), hence I had to make some additional changes to propagate the port change throughout the K8s platform.

So, in summary, I had to apply the following set of patches to get everything in place. If necessary, you could reuse them by applying them on top of the v1.7.2-beta.0 tag (possibly even a different tag) of the K8s source using the git apply command.
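
For instance, assuming you save the two patches below as skip-downloads.patch and ubuntu-tweaks.patch (the file names are entirely up to you), applying them would go something like:

git checkout v1.7.2-beta.0
git apply --check skip-downloads.patch ubuntu-tweaks.patch   # dry run: verify they apply cleanly
git apply skip-downloads.patch ubuntu-tweaks.patch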

  1. Skipping etcd, flannel and docker binaries:

    diff --git a/cluster/centos/build.sh b/cluster/centos/build.sh
    index 5d31437..df057e4 100755
    --- a/cluster/centos/build.sh
    +++ b/cluster/centos/build.sh
    @@ -42,19 +42,6 @@ function clean-up() {
     function download-releases() {
       rm -rf ${RELEASES_DIR}
       mkdir -p ${RELEASES_DIR}
    -
    -  echo "Download flannel release v${FLANNEL_VERSION} ..."
    -  curl -L ${FLANNEL_DOWNLOAD_URL} -o ${RELEASES_DIR}/flannel.tar.gz
    -
    -  echo "Download etcd release v${ETCD_VERSION} ..."
    -  curl -L ${ETCD_DOWNLOAD_URL} -o ${RELEASES_DIR}/etcd.tar.gz
    -
    -  echo "Download kubernetes release v${K8S_VERSION} ..."
    -  curl -L ${K8S_CLIENT_DOWNLOAD_URL} -o ${RELEASES_DIR}/kubernetes-client-linux-amd64.tar.gz
    -  curl -L ${K8S_SERVER_DOWNLOAD_URL} -o ${RELEASES_DIR}/kubernetes-server-linux-amd64.tar.gz
    -
    -  echo "Download docker release v${DOCKER_VERSION} ..."
    -  curl -L ${DOCKER_DOWNLOAD_URL} -o ${RELEASES_DIR}/docker.tar.gz
     }
     
     function unpack-releases() {
    @@ -80,19 +67,12 @@ function unpack-releases() {
       fi
     
       # k8s
    -  if [[ -f ${RELEASES_DIR}/kubernetes-client-linux-amd64.tar.gz ]] ; then
    -    tar xzf ${RELEASES_DIR}/kubernetes-client-linux-amd64.tar.gz -C ${RELEASES_DIR}
         cp ${RELEASES_DIR}/kubernetes/client/bin/kubectl ${BINARY_DIR}
    -  fi
    -
    -  if [[ -f ${RELEASES_DIR}/kubernetes-server-linux-amd64.tar.gz ]] ; then
    -    tar xzf ${RELEASES_DIR}/kubernetes-server-linux-amd64.tar.gz -C ${RELEASES_DIR}
         cp ${RELEASES_DIR}/kubernetes/server/bin/kube-apiserver \
            ${RELEASES_DIR}/kubernetes/server/bin/kube-controller-manager \
            ${RELEASES_DIR}/kubernetes/server/bin/kube-scheduler ${BINARY_DIR}/master/bin
         cp ${RELEASES_DIR}/kubernetes/server/bin/kubelet \
            ${RELEASES_DIR}/kubernetes/server/bin/kube-proxy ${BINARY_DIR}/node/bin
    -  fi
     
       # docker
       if [[ -f ${RELEASES_DIR}/docker.tar.gz ]]; then
    diff --git a/cluster/centos/config-build.sh b/cluster/centos/config-build.sh
    index 4887bc1..39a2b25 100755
    --- a/cluster/centos/config-build.sh
    +++ b/cluster/centos/config-build.sh
    @@ -23,13 +23,13 @@ RELEASES_DIR=${RELEASES_DIR:-/tmp/downloads}
     DOCKER_VERSION=${DOCKER_VERSION:-"1.12.1"}
     
     # Define flannel version to use.
    -FLANNEL_VERSION=${FLANNEL_VERSION:-"0.6.1"}
    +FLANNEL_VERSION=${FLANNEL_VERSION:-"0.8.0"}
     
     # Define etcd version to use.
    -ETCD_VERSION=${ETCD_VERSION:-"3.0.9"}
    +ETCD_VERSION=${ETCD_VERSION:-"3.2.2"}
     
     # Define k8s version to use.
    -K8S_VERSION=${K8S_VERSION:-"1.3.7"}
    +K8S_VERSION=${K8S_VERSION:-"1.7.0"}
     
     DOCKER_DOWNLOAD_URL=\
     "https://get.docker.com/builds/Linux/x86_64/docker-${DOCKER_VERSION}.tgz"
    diff --git a/cluster/kube-up.sh b/cluster/kube-up.sh
    index 7877fb9..9e793ce 100755
    --- a/cluster/kube-up.sh
    +++ b/cluster/kube-up.sh
    @@ -40,8 +40,6 @@ fi
     
     echo "... calling verify-prereqs" >&2
     verify-prereqs
    -echo "... calling verify-kube-binaries" >&2
    -verify-kube-binaries
     
     if [[ "${KUBE_STAGE_IMAGES:-}" == "true" ]]; then
       echo "... staging images" >&2
    
  2. Changing apiserver listen port to 8090, and binary (symlink) and service configuration file locations to Debian defaults:

    diff --git a/cluster/centos/master/scripts/apiserver.sh b/cluster/centos/master/scripts/apiserver.sh
    index 6b7b1c2..62d24fd 100755
    --- a/cluster/centos/master/scripts/apiserver.sh
    +++ b/cluster/centos/master/scripts/apiserver.sh
    @@ -43,8 +43,8 @@ KUBE_ETCD_KEYFILE="--etcd-keyfile=/srv/kubernetes/etcd/client-key.pem"
     # --insecure-bind-address=127.0.0.1: The IP address on which to serve the --insecure-port.
     KUBE_API_ADDRESS="--insecure-bind-address=0.0.0.0"
     
    -# --insecure-port=8080: The port on which to serve unsecured, unauthenticated access.
    -KUBE_API_PORT="--insecure-port=8080"
    +# --insecure-port=8090: The port on which to serve unsecured, unauthenticated access.
    +KUBE_API_PORT="--insecure-port=8090"
     
     # --kubelet-port=10250: Kubelet port
     NODE_PORT="--kubelet-port=10250"
    @@ -101,7 +101,7 @@ KUBE_APISERVER_OPTS="   \${KUBE_LOGTOSTDERR}         \\
                             \${KUBE_API_TLS_PRIVATE_KEY_FILE}"
     
     
    -cat <<EOF >/usr/lib/systemd/system/kube-apiserver.service
    +cat <<EOF >/lib/systemd/system/kube-apiserver.service
     [Unit]
     Description=Kubernetes API Server
     Documentation=https://github.com/kubernetes/kubernetes
    diff --git a/cluster/centos/master/scripts/controller-manager.sh b/cluster/centos/master/scripts/controller-manager.sh
    index 3025d06..5aa0f12 100755
    --- a/cluster/centos/master/scripts/controller-manager.sh
    +++ b/cluster/centos/master/scripts/controller-manager.sh
    @@ -20,7 +20,7 @@ MASTER_ADDRESS=${1:-"8.8.8.18"}
     cat <<EOF >/opt/kubernetes/cfg/kube-controller-manager
     KUBE_LOGTOSTDERR="--logtostderr=true"
     KUBE_LOG_LEVEL="--v=4"
    -KUBE_MASTER="--master=${MASTER_ADDRESS}:8080"
    +KUBE_MASTER="--master=${MASTER_ADDRESS}:8090"
     
     # --root-ca-file="": If set, this root certificate authority will be included in
     # service account's token secret. This must be a valid PEM-encoded CA bundle.
    @@ -41,7 +41,7 @@ KUBE_CONTROLLER_MANAGER_OPTS="  \${KUBE_LOGTOSTDERR} \\
                                     \${KUBE_CONTROLLER_MANAGER_SERVICE_ACCOUNT_PRIVATE_KEY_FILE}\\
                                     \${KUBE_LEADER_ELECT}"
     
    -cat <<EOF >/usr/lib/systemd/system/kube-controller-manager.service
    +cat <<EOF >/lib/systemd/system/kube-controller-manager.service
     [Unit]
     Description=Kubernetes Controller Manager
     Documentation=https://github.com/kubernetes/kubernetes
    diff --git a/cluster/centos/master/scripts/etcd.sh b/cluster/centos/master/scripts/etcd.sh
    index aa73b57..34eff5c 100755
    --- a/cluster/centos/master/scripts/etcd.sh
    +++ b/cluster/centos/master/scripts/etcd.sh
    @@ -64,7 +64,7 @@ ETCD_PEER_CERT_FILE="/srv/kubernetes/etcd/peer-${ETCD_NAME}.pem"
     ETCD_PEER_KEY_FILE="/srv/kubernetes/etcd/peer-${ETCD_NAME}-key.pem"
     EOF
     
    -cat <<EOF >//usr/lib/systemd/system/etcd.service
    +cat <<EOF >/lib/systemd/system/etcd.service
     [Unit]
     Description=Etcd Server
     After=network.target
    diff --git a/cluster/centos/master/scripts/flannel.sh b/cluster/centos/master/scripts/flannel.sh
    index 092fcd8..21e2bbe 100644
    --- a/cluster/centos/master/scripts/flannel.sh
    +++ b/cluster/centos/master/scripts/flannel.sh
    @@ -30,7 +30,7 @@ FLANNEL_ETCD_CERTFILE="--etcd-certfile=${CERT_FILE}"
     FLANNEL_ETCD_KEYFILE="--etcd-keyfile=${KEY_FILE}"
     EOF
     
    -cat <<EOF >/usr/lib/systemd/system/flannel.service
    +cat <<EOF >/lib/systemd/system/flannel.service
     [Unit]
     Description=Flanneld overlay address etcd agent
     After=network.target
    diff --git a/cluster/centos/master/scripts/scheduler.sh b/cluster/centos/master/scripts/scheduler.sh
    index 1a68d71..3b444bf 100755
    --- a/cluster/centos/master/scripts/scheduler.sh
    +++ b/cluster/centos/master/scripts/scheduler.sh
    @@ -27,7 +27,7 @@ KUBE_LOGTOSTDERR="--logtostderr=true"
     # --v=0: log level for V logs
     KUBE_LOG_LEVEL="--v=4"
     
    -KUBE_MASTER="--master=${MASTER_ADDRESS}:8080"
    +KUBE_MASTER="--master=${MASTER_ADDRESS}:8090"
     
     # --leader-elect
     KUBE_LEADER_ELECT="--leader-elect"
    @@ -43,7 +43,7 @@ KUBE_SCHEDULER_OPTS="   \${KUBE_LOGTOSTDERR}     \\
                             \${KUBE_LEADER_ELECT}    \\
                             \${KUBE_SCHEDULER_ARGS}"
     
    -cat <<EOF >/usr/lib/systemd/system/kube-scheduler.service
    +cat <<EOF >/lib/systemd/system/kube-scheduler.service
     [Unit]
     Description=Kubernetes Scheduler
     Documentation=https://github.com/kubernetes/kubernetes
    diff --git a/cluster/centos/node/bin/mk-docker-opts.sh b/cluster/centos/node/bin/mk-docker-opts.sh
    index 041d977..177ee9f 100755
    --- a/cluster/centos/node/bin/mk-docker-opts.sh
    +++ b/cluster/centos/node/bin/mk-docker-opts.sh
    @@ -69,7 +69,6 @@ done
     
     if [[ $indiv_opts = false ]] && [[ $combined_opts = false ]]; then
       indiv_opts=true
    -  combined_opts=true
     fi
     
     if [[ -f "$flannel_env" ]]; then
    diff --git a/cluster/centos/node/scripts/docker.sh b/cluster/centos/node/scripts/docker.sh
    index 320446a..b0312fc 100755
    --- a/cluster/centos/node/scripts/docker.sh
    +++ b/cluster/centos/node/scripts/docker.sh
    @@ -20,10 +20,10 @@ DOCKER_OPTS=${1:-""}
     DOCKER_CONFIG=/opt/kubernetes/cfg/docker
     
     cat <<EOF >$DOCKER_CONFIG
    -DOCKER_OPTS="-H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock -s overlay --selinux-enabled=false ${DOCKER_OPTS}"
    +DOCKER_OPTS="-H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock -s aufs --selinux-enabled=false ${DOCKER_OPTS}"
     EOF
     
    -cat <<EOF >/usr/lib/systemd/system/docker.service
    +cat <<EOF >/lib/systemd/system/docker.service
     [Unit]
     Description=Docker Application Container Engine
     Documentation=http://docs.docker.com
    @@ -35,7 +35,7 @@ Type=notify
     EnvironmentFile=-/run/flannel/docker
     EnvironmentFile=-/opt/kubernetes/cfg/docker
     WorkingDirectory=/opt/kubernetes/bin
    -ExecStart=/opt/kubernetes/bin/dockerd \$DOCKER_OPT_BIP \$DOCKER_OPT_MTU \$DOCKER_OPTS
    +ExecStart=/usr/bin/dockerd \$DOCKER_OPT_BIP \$DOCKER_OPT_MTU \$DOCKER_OPTS
     LimitNOFILE=1048576
     LimitNPROC=1048576
     
    diff --git a/cluster/centos/node/scripts/flannel.sh b/cluster/centos/node/scripts/flannel.sh
    index 2830dae..a927bb2 100755
    --- a/cluster/centos/node/scripts/flannel.sh
    +++ b/cluster/centos/node/scripts/flannel.sh
    @@ -30,7 +30,7 @@ FLANNEL_ETCD_CERTFILE="--etcd-certfile=${CERT_FILE}"
     FLANNEL_ETCD_KEYFILE="--etcd-keyfile=${KEY_FILE}"
     EOF
     
    -cat <<EOF >/usr/lib/systemd/system/flannel.service
    +cat <<EOF >/lib/systemd/system/flannel.service
     [Unit]
     Description=Flanneld overlay address etcd agent
     After=network.target
    diff --git a/cluster/centos/node/scripts/kubelet.sh b/cluster/centos/node/scripts/kubelet.sh
    index 323a03e..4c93015 100755
    --- a/cluster/centos/node/scripts/kubelet.sh
    +++ b/cluster/centos/node/scripts/kubelet.sh
    @@ -39,7 +39,7 @@ NODE_HOSTNAME="--hostname-override=${NODE_ADDRESS}"
     
     # --api-servers=[]: List of Kubernetes API servers for publishing events,
     # and reading pods and services. (ip:port), comma separated.
    -KUBELET_API_SERVER="--api-servers=${MASTER_ADDRESS}:8080"
    +KUBELET_API_SERVER="--api-servers=${MASTER_ADDRESS}:8090"
     
     # --allow-privileged=false: If true, allow containers to request privileged mode. [default=false]
     KUBE_ALLOW_PRIV="--allow-privileged=false"
    @@ -63,7 +63,7 @@ KUBE_PROXY_OPTS="   \${KUBE_LOGTOSTDERR}     \\
                         \${KUBELET_DNS_DOMAIN}      \\
                         \${KUBELET_ARGS}"
     
    -cat <<EOF >/usr/lib/systemd/system/kubelet.service
    +cat <<EOF >/lib/systemd/system/kubelet.service
     [Unit]
     Description=Kubernetes Kubelet
     After=docker.service
    diff --git a/cluster/centos/node/scripts/proxy.sh b/cluster/centos/node/scripts/proxy.sh
    index 584987b..1f365fb 100755
    --- a/cluster/centos/node/scripts/proxy.sh
    +++ b/cluster/centos/node/scripts/proxy.sh
    @@ -29,7 +29,7 @@ KUBE_LOG_LEVEL="--v=4"
     NODE_HOSTNAME="--hostname-override=${NODE_ADDRESS}"
     
     # --master="": The address of the Kubernetes API server (overrides any value in kubeconfig)
    -KUBE_MASTER="--master=http://${MASTER_ADDRESS}:8080"
    +KUBE_MASTER="--master=http://${MASTER_ADDRESS}:8090"
     EOF
     
     KUBE_PROXY_OPTS="   \${KUBE_LOGTOSTDERR} \\
    @@ -37,7 +37,7 @@ KUBE_PROXY_OPTS="   \${KUBE_LOGTOSTDERR} \\
                         \${NODE_HOSTNAME}    \\
                         \${KUBE_MASTER}"
     
    -cat <<EOF >/usr/lib/systemd/system/kube-proxy.service
    +cat <<EOF >/lib/systemd/system/kube-proxy.service
     [Unit]
     Description=Kubernetes Proxy
     After=network.target
    diff --git a/cluster/centos/util.sh b/cluster/centos/util.sh
    index 88302a3..dbb3ca5 100755
    --- a/cluster/centos/util.sh
    +++ b/cluster/centos/util.sh
    @@ -136,7 +136,7 @@ function kube-up() {
     
       # set CONTEXT and KUBE_SERVER values for create-kubeconfig() and get-password()
       export CONTEXT="centos"
    -  export KUBE_SERVER="http://${MASTER_ADVERTISE_ADDRESS}:8080"
    +  export KUBE_SERVER="http://${MASTER_ADVERTISE_ADDRESS}:8090"
       source "${KUBE_ROOT}/cluster/common.sh"
     
       # set kubernetes user and password
    @@ -199,7 +199,7 @@ function troubleshoot-node() {
     function tear-down-master() {
     echo "[INFO] tear-down-master on $1"
       for service_name in etcd kube-apiserver kube-controller-manager kube-scheduler ; do
    -      service_file="/usr/lib/systemd/system/${service_name}.service"
    +      service_file="/lib/systemd/system/${service_name}.service"
           kube-ssh "$1" " \
             if [[ -f $service_file ]]; then \
               sudo systemctl stop $service_name; \
    @@ -217,7 +217,7 @@ echo "[INFO] tear-down-master on $1"
     function tear-down-node() {
     echo "[INFO] tear-down-node on $1"
       for service_name in kube-proxy kubelet docker flannel ; do
    -      service_file="/usr/lib/systemd/system/${service_name}.service"
    +      service_file="/lib/systemd/system/${service_name}.service"
           kube-ssh "$1" " \
             if [[ -f $service_file ]]; then \
               sudo systemctl stop $service_name; \
    

With these changes in place, installing K8s on my machine was straightforward: it boiled down to simply running this command from within the cluster directory of the patched source:

MASTER=janaka@10.0.0.1 \
NODES=janaka@10.0.0.1 \
DOCKER_OPTS="--insecure-registry=hub.adroitlogic.com:5000" \
KUBERNETES_PROVIDER=centos \
CERT_GROUP=janaka \
./kube-up.sh

  1. Because the certificate generation process has problems with localhost or 127.0.0.1, I had to use a static IP (10.0.0.1) assigned to one of my network interfaces.
  2. I have one master and one worker node, both of which are my own machine itself.
  3. janaka is the username on my local machine.
  4. We have a local Docker hub for holding our IPS images, at hub.adroitlogic.com:5000 (resolved via internal hostname mappings), which we need to inject into the Docker startup script (in order to be able to utilize our own images within the future IPS set-up).
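
Once kube-up.sh completes, a quick sanity check along the following lines (note the custom 8090 apiserver port introduced by the earlier patch) should confirm that the single-node cluster is alive:

kubectl -s http://10.0.0.1:8090 get nodes
kubectl -s http://10.0.0.1:8090 get pods --all-namespaces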

Takeaway?

  • learned a bit about how K8s components fit together,
  • added quite a bit of shell scripting tips and tricks to my book of knowledge,
  • had my 101 on Linux services (systemd),
  • skipped one lunch,
  • fought several hours of frustration,
  • and finally felt a truckload of satisfaction, which flushed out the frustration without a trace.

Friday, September 8, 2017

Gracefully Shutting Down Java in Containers: Why You Should Double-Check!

Gracefulness is not only an admirable human quality: it is also a must-have for any application program, especially when it is bearing the burden of mission-critical domains.

UltraESB has had a good history of maintaining gracefulness throughout its runtime, including shutdown. The new UltraESB-X honoured the tradition and implemented graceful shutdown in its 17.07 release.

When we composed the ips-worker Docker image for our Integration Platform (IPS) as a tailored version of UltraESB-X, we could guarantee that ESBs running in the platform would shut down gracefully (or so we thought).

Unfortunately not.

Whenever we redeploy or change the replication count of a cluster, all ESB instances running under that cluster get terminated (and new instances get spawned to take their place). The termination is supposed to be graceful: the ESBs would first stop accepting any new incoming messages, and hold off the internal shutdown sequence for a few seconds until processing of in-flight messages gets completed, or a timeout ends the holdoff.

On our Kubernetes-based mainstream IPS release, we retrieve logs of ESB instances (pods) via the K8s API as well as via a database appender, so that we can analyze them later. In analyzing the logs, we noticed that we were never seeing any ESB shutdown logs, no matter how big the log store had grown. It was as if the ESBs were getting brutally killed as soon as the termination signal was received.

To investigate the issue, I started off with a simplified Java program: one that registers a shutdown hook (the world-famous way of implementing graceful shutdown in Java, which we had utilized in both our ESBs) and keeps running forever, printing some text periodically (to indicate that the main thread is active). In the shutdown hook, I interrupt the main thread, change the output to indicate that we are shutting down, and let the handler finish after a few seconds, consistent with a "mock" graceful shutdown.

class Kill {

	private static Thread main;

	public static void main(String[] a) throws Exception {

		Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
			public void run() {
				System.out.println("TERM");
				main.interrupt();
				for (int i = 0; i < 4; i++) {
					System.out.println("busy");
					try {
						Thread.sleep(1000);
					} catch (Exception e) {}
				}
				System.out.println("exit");
			}
		}));

		main = Thread.currentThread();
		while (true) {
			Thread.sleep(1000);
			System.out.println("run");
		}
	}
}

Testing it is pretty easy:

javac Kill.java
java Kill

While the program keeps on printing:

run
run
run
...

press Ctrl+C to see what happens:

...
run
run
^CTERM
busy
Exception in thread "main" java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)
        at Kill.main(Kill.java:22)
busy
busy
busy
exit

Looks good.

That done, converting this into a fully-fledged Docker container took only a few minutes, with the following Dockerfile and build command:

FROM openjdk:8-jre-alpine
ADD Kill*.class /
ENTRYPOINT ["java", "Kill"]

docker build -t kill:v1 .

Next I ran a container with the new image:

docker run -it --rm kill:v1

which gave the expected output:

run
run
run
...

Then I sent a TERM signal (similar in effect to the INT signal triggered by Ctrl+C; both invoke Java's shutdown hooks) to the process, using the kill command:

# pardon the fancy functions;
# they are quite useful for me when dealing with processes

function pid() {
    ps -ef | grep $1 | grep -v grep | awk '{print $2}'
}

function killsig() {
    for i in $(pid $2); do
        sudo kill $1 $i
    done
}

alias termit='killsig -15'

# with all the above in place, I just have to run:
termit Kill

As expected, the shutdown hook got invoked and executed smoothly.

Going a step further, I made the whole thing into a standalone K8s pod (backed by a single-replica Deployment):

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kill
spec:
  selector:
    matchLabels:
      k8s-app: kill
  template:
    metadata:
      labels:
        k8s-app: kill
    spec:
      containers:
      - name: kill
        image: kill:v1

and tried out the same thing, this time by zeroing-out spec.replicas (same as we do in IPS) via the kubectl edit deployment command, instead of a manual kill -TERM:

kubectl edit deployment kill

# vi is my default editor
# set "replicas" to 0 (line 20 in my case)
# <ESC>:wq<ENTER>

while having a console tail of the pod in a separate window:

# fancy stuff again

function findapp() {
    kubectl get pod -l k8s-app=$1 -oname | cut -b 6-;
}

function klog() {
    kubectl logs -f $(findapp $1);
}

# the final command
klog kill

showing the output:

run
run
...
run
TERM
busy
Exception in thread "main" java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)
        at Kill.main(Kill.java:22)
busy
busy
busy
exit

Damn, it still shuts down gracefully!

So what's wrong with my ips-worker?

Just to verify, I got a single-replica cluster running on IPS, manually changed the image (spec.template.spec.containers[0].image) and startup command (spec.template.spec.containers[0].command) of the K8s deployment via kubectl edit (keeping all other factors, such as environment variables and volume mounts, unchanged), and tried out the same zero-out sequence.

Same result! Graceful shutdown!

Then it occurred to me that, while my kill container simply uses a java Kill command, ips-worker uses a somewhat more complicated command:

/bin/sh -c <copy some files> && <run some custom command> && <run ultraesb-x.sh>

where, in the last part, we construct (with a specially fabricated classpath, and some JVM parameters) and execute a pretty long java command that starts up the UltraESB-X beast.

So ultimately, the final live command in the container boils down to:

/bin/sh -c <basepath>/ultraesb-x.sh

Hence I tried running the java command via a shell in my kill container as well, by slightly changing the Dockerfile:

FROM openjdk:8-jre-alpine
ADD Kill*.class /
# note the missing brackets and quotes, so that the command gets the default /bin/sh -c prefix
ENTRYPOINT java Kill

and yay! Graceful shutdown was no more. The Java process got killed brutally, on Docker (docker stop) as well as in K8s (replica zero-out).

Investigating further, I was guided by Google to this popular SE post, which basically said that the shell (sh) does not pass received signals to its child processes by default. The suggested alternative was to run the internal command via exec, which would basically replace the parent process (sh) with the child (java, in the case of kill):

FROM openjdk:8-jre-alpine
ADD Kill*.class /
ENTRYPOINT exec java Kill

For kill, that did the trick right away.
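
One quick way to double-check the effect (hypothetical commands, assuming the image has been rebuilt, say still as kill:v1, with the exec-ed ENTRYPOINT) is to look at what actually ends up as PID 1 inside the container:

docker run -d --name kill-test kill:v1
docker exec kill-test ps -o pid,comm   # with the exec form, PID 1 should now be 'java' rather than 'sh'
docker rm -f kill-test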

For ips-worker, things were a bit different, as there were two levels of invocation: the container's command invoking a chain of commands via /bin/sh -c, and the built-in ultraesb-x.sh invoking the ultimate java command. Hence I had to include exec in two places:

Once at the end of the command chain:

/bin/sh -c \
<copy some files> && \
<run some custom command> && \
exec <basepath>/ultraesb-x.sh

And again at the end of ultraesb-x.sh:

# do some magic to compose the classpath and other info for ESB startup

exec $JAVA_HOME/bin/java <classpath and other params>

Simple as it may seem, those two execs were enough to bring back graceful shutdown to ips-worker, and hence to our Integration Platform.

Tuesday, September 5, 2017

Kubernetes 1.7 on CentOS: This is How We Nailed It!

Please Note: If you are looking for a comprehensive, step-by-step guide on installing Kubernetes on CentOS, check out this guide (composed by one of my colleagues) instead.

Caution: This blog was originally composed several months ago, when the K8s release cycle was at v1.7.0-alpha.3. As always, K8s has zoomed past us, hence most of the content would be downright obsolete by now!


If you are a fan of Kubernetes (K8s) and the ton of amazing features it has in store for developers, you would surely agree with me on the fact that getting a local K8s setup up and running is one of the most fundamental steps of learning K8s.

Since we have been waiting for quite some time to re-spin our Integration Platform (IPS) product on top of a new K8s version, we decided it was high time to get a brand new cluster up and running with 1.7.0 (which is still in beta as of this writing, but would become stable by the time we migrate our own code on top of it). Due to some stability requirements we wanted a setup running a RedHat-compatible base OS, so we decided to go ahead with CentOS.

While we could always have followed one of the pre-compiled guides from the K8s docs, such as this one, we wanted to be a bit more adventurous and get things done by way of the K8s source; besides, we wanted to be able to upgrade our set-up when required, for which the best way was to properly understand how the K8s authors had nailed it in the original source.

First of all, we got hold of a clone of the K8s source at the master branch (as we needed the latest code base) and the v1.7.0-alpha.3 tag (since we were looking for that specific release).

Next, we built the K8s binary artifacts using the make command at the source root. (You can utilize the Docker-based cross compilation approach mentioned in the above guide as well, but note that it would download a 1.8+ GB cross compiler image and build artifacts for all platforms, resulting in a 20+ GB total size.)

We set up a VirtualBox virtual machine running CentOS 7, and installed docker, etcd and flanneld afresh. Docker was installed via yum (sudo yum install docker) and configured for use by our default, non-root user. Others were installed via yum as well, with versions etcd-3.1.3-1.el7.x86_64.rpm and flannel-0.7.0-1.el7.x86_64.rpm, respectively.

We were hoping to clone the machine in order to obtain a multi-node set-up (without undertaking the overhead of installing the individual components on each machine separately). However, this seemingly straightforward approach complicated the whole process.

The K8s installer for CentOS (located at cluster/centos) by default downloads archives of the K8s client/server, etcd, flanneld and docker binaries as part of the configuration process. Since we needed to install the K8s binaries from our custom build instead, and didn't want to download the other binaries at all (as they were already installed), we had to customize the corresponding build and deployment scripts to avoid downloading the artifacts and utilize our build artifacts instead, and to control the reconfiguration of etcd, flanneld and docker.

What follows is a breakdown of the changes we had to perform on top of the v1.7.0-alpha.3 tag to get things working with native (yum-based) installations of etcd, flanneld and docker, with some attempts to explain the rationale behind each change.

Tip: If you wish to utilize any of these patches, it should be possible to save them into a plaintext file (with or without the .patch extension) and apply them on top of v1.7.0-alpha.3 tag of the K8s source using the git apply command.

Tip: If you only have a subset of the above services (etcd, flanneld and docker) already installed, apply only the relevant files/sections of the patches.

  1. Avoiding download of binary artifacts:
    diff --git a/cluster/centos/build.sh b/cluster/centos/build.sh
    index 18bbe6f..09a3631 100755
    --- a/cluster/centos/build.sh
    +++ b/cluster/centos/build.sh
    @@ -42,19 +42,6 @@ function clean-up() {
     function download-releases() {
       rm -rf ${RELEASES_DIR}
       mkdir -p ${RELEASES_DIR}
    -
    -  echo "Download flannel release v${FLANNEL_VERSION} ..."
    -  curl -L ${FLANNEL_DOWNLOAD_URL} -o ${RELEASES_DIR}/flannel.tar.gz
    -
    -  echo "Download etcd release v${ETCD_VERSION} ..."
    -  curl -L ${ETCD_DOWNLOAD_URL} -o ${RELEASES_DIR}/etcd.tar.gz
    -
    -  echo "Download kubernetes release v${K8S_VERSION} ..."
    -  curl -L ${K8S_CLIENT_DOWNLOAD_URL} -o ${RELEASES_DIR}/kubernetes-client-linux-amd64.tar.gz
    -  curl -L ${K8S_SERVER_DOWNLOAD_URL} -o ${RELEASES_DIR}/kubernetes-server-linux-amd64.tar.gz
    -
    -  echo "Download docker release v${DOCKER_VERSION} ..."
    -  curl -L ${DOCKER_DOWNLOAD_URL} -o ${RELEASES_DIR}/docker.tar.gz
     }
     
     function unpack-releases() {
    @@ -80,19 +67,12 @@ function unpack-releases() {
       fi
     
       # k8s
    -  if [[ -f ${RELEASES_DIR}/kubernetes-client-linux-amd64.tar.gz ]] ; then
    -    tar xzf ${RELEASES_DIR}/kubernetes-client-linux-amd64.tar.gz -C ${RELEASES_DIR}
         cp ${RELEASES_DIR}/kubernetes/client/bin/kubectl ${BINARY_DIR}
    -  fi
    -
    -  if [[ -f ${RELEASES_DIR}/kubernetes-server-linux-amd64.tar.gz ]] ; then
    -    tar xzf ${RELEASES_DIR}/kubernetes-server-linux-amd64.tar.gz -C ${RELEASES_DIR}
         cp ${RELEASES_DIR}/kubernetes/server/bin/kube-apiserver \
            ${RELEASES_DIR}/kubernetes/server/bin/kube-controller-manager \
            ${RELEASES_DIR}/kubernetes/server/bin/kube-scheduler ${BINARY_DIR}/master/bin
         cp ${RELEASES_DIR}/kubernetes/server/bin/kubelet \
            ${RELEASES_DIR}/kubernetes/server/bin/kube-proxy ${BINARY_DIR}/node/bin
    -  fi
     
       # docker
       if [[ -f ${RELEASES_DIR}/docker.tar.gz ]]; then
    
  2. Avoiding verification of artifacts (since we no longer download them):
    diff --git a/cluster/kube-up.sh b/cluster/kube-up.sh
    index 7877fb9..9e793ce 100755
    --- a/cluster/kube-up.sh
    +++ b/cluster/kube-up.sh
    @@ -40,8 +40,6 @@ fi
     
     echo "... calling verify-prereqs" >&2
     verify-prereqs
    -echo "... calling verify-kube-binaries" >&2
    -verify-kube-binaries
     
     if [[ "${KUBE_STAGE_IMAGES:-}" == "true" ]]; then
       echo "... staging images" >&2
    
  3. Catering for natively-installed etcd and flanneld binaries (which are located (or rather, symlinked) at /usr/bin instead of /opt/kubernetes/bin, where K8s expects them to be):
    diff --git a/cluster/centos/master/scripts/etcd.sh b/cluster/centos/master/scripts/etcd.sh
    index aa73b57..6b575da 100755
    --- a/cluster/centos/master/scripts/etcd.sh
    +++ b/cluster/centos/master/scripts/etcd.sh
    @@ -74,7 +74,7 @@ Type=simple
     WorkingDirectory=${etcd_data_dir}
     EnvironmentFile=-/opt/kubernetes/cfg/etcd.conf
     # set GOMAXPROCS to number of processors
    -ExecStart=/bin/bash -c "GOMAXPROCS=\$(nproc) /opt/kubernetes/bin/etcd"
    +ExecStart=/bin/bash -c "GOMAXPROCS=\$(nproc) /usr/bin/etcd"
     Type=notify
     
     [Install]
    diff --git a/cluster/centos/master/scripts/flannel.sh b/cluster/centos/master/scripts/flannel.sh
    index 092fcd8..5d9630d 100644
    --- a/cluster/centos/master/scripts/flannel.sh
    +++ b/cluster/centos/master/scripts/flannel.sh
    @@ -37,7 +37,7 @@ After=network.target
     
     [Service]
     EnvironmentFile=-/opt/kubernetes/cfg/flannel
    -ExecStart=/opt/kubernetes/bin/flanneld --ip-masq \${FLANNEL_ETCD} \${FLANNEL_ETCD_KEY} \${FLANNEL_ETCD_CAFILE} \${FLANNEL_ETCD_CERTFILE} \${FLANNEL_ETCD_KEYFILE}
    +ExecStart=/usr/bin/flanneld --ip-masq \${FLANNEL_ETCD} \${FLANNEL_ETCD_KEY} \${FLANNEL_ETCD_CAFILE} \${FLANNEL_ETCD_CERTFILE} \${FLANNEL_ETCD_KEYFILE}
     
     Type=notify
     
    @@ -48,7 +48,7 @@ EOF
     # Store FLANNEL_NET to etcd.
     attempt=0
     while true; do
    -  /opt/kubernetes/bin/etcdctl --ca-file ${CA_FILE} --cert-file ${CERT_FILE} --key-file ${KEY_FILE} \
    +  /usr/bin/etcdctl --ca-file ${CA_FILE} --cert-file ${CERT_FILE} --key-file ${KEY_FILE} \
         --no-sync -C ${ETCD_SERVERS} \
         get /coreos.com/network/config >/dev/null 2>&1
       if [[ "$?" == 0 ]]; then
    @@ -59,7 +59,7 @@ while true; do
           exit 2
         fi
     
    -    /opt/kubernetes/bin/etcdctl --ca-file ${CA_FILE} --cert-file ${CERT_FILE} --key-file ${KEY_FILE} \
    +    /usr/bin/etcdctl --ca-file ${CA_FILE} --cert-file ${CERT_FILE} --key-file ${KEY_FILE} \
           --no-sync -C ${ETCD_SERVERS} \
           mk /coreos.com/network/config "{\"Network\":\"${FLANNEL_NET}\"}" >/dev/null 2>&1
         attempt=$((attempt+1))
    diff --git a/cluster/centos/node/bin/mk-docker-opts.sh b/cluster/centos/node/bin/mk-docker-opts.sh
    index 041d977..177ee9f 100755
    --- a/cluster/centos/node/bin/mk-docker-opts.sh
    +++ b/cluster/centos/node/bin/mk-docker-opts.sh
    @@ -69,7 +69,6 @@ done
     
     if [[ $indiv_opts = false ]] && [[ $combined_opts = false ]]; then
       indiv_opts=true
    -  combined_opts=true
     fi
     
     if [[ -f "$flannel_env" ]]; then
    diff --git a/cluster/centos/node/scripts/docker.sh b/cluster/centos/node/scripts/docker.sh
    index 320446a..3f38f3e 100755
    --- a/cluster/centos/node/scripts/docker.sh
    +++ b/cluster/centos/node/scripts/docker.sh
    @@ -35,7 +35,7 @@ Type=notify
     EnvironmentFile=-/run/flannel/docker
     EnvironmentFile=-/opt/kubernetes/cfg/docker
     WorkingDirectory=/opt/kubernetes/bin
    -ExecStart=/opt/kubernetes/bin/dockerd \$DOCKER_OPT_BIP \$DOCKER_OPT_MTU \$DOCKER_OPTS
    +ExecStart=/usr/bin/dockerd \$DOCKER_OPT_BIP \$DOCKER_OPT_MTU \$DOCKER_OPTS
     LimitNOFILE=1048576
     LimitNPROC=1048576
     
    diff --git a/cluster/centos/node/scripts/flannel.sh b/cluster/centos/node/scripts/flannel.sh
    index 2830dae..384788f 100755
    --- a/cluster/centos/node/scripts/flannel.sh
    +++ b/cluster/centos/node/scripts/flannel.sh
    @@ -39,7 +39,7 @@ Before=docker.service
     [Service]
     EnvironmentFile=-/opt/kubernetes/cfg/flannel
     ExecStartPre=/opt/kubernetes/bin/remove-docker0.sh
    -ExecStart=/opt/kubernetes/bin/flanneld --ip-masq \${FLANNEL_ETCD} \${FLANNEL_ETCD_KEY} \${FLANNEL_ETCD_CAFILE} \${FLANNEL_ETCD_CERTFILE} \${FLANNEL_ETCD_KEYFILE}
    +ExecStart=/usr/bin/flanneld --ip-masq \${FLANNEL_ETCD} \${FLANNEL_ETCD_KEY} \${FLANNEL_ETCD_CAFILE} \${FLANNEL_ETCD_CERTFILE} \${FLANNEL_ETCD_KEYFILE}
     ExecStartPost=/opt/kubernetes/bin/mk-docker-opts.sh -d /run/flannel/docker
     
     Type=notify
    @@ -52,7 +52,7 @@ EOF
     # Store FLANNEL_NET to etcd.
     attempt=0
     while true; do
    -  /opt/kubernetes/bin/etcdctl --ca-file ${CA_FILE} --cert-file ${CERT_FILE} --key-file ${KEY_FILE} \
    +  /usr/bin/etcdctl --ca-file ${CA_FILE} --cert-file ${CERT_FILE} --key-file ${KEY_FILE} \
         --no-sync -C ${ETCD_SERVERS} \
         get /coreos.com/network/config >/dev/null 2>&1
       if [[ "$?" == 0 ]]; then
    @@ -63,7 +63,7 @@ while true; do
           exit 2
         fi
     
    -    /opt/kubernetes/bin/etcdctl --ca-file ${CA_FILE} --cert-file ${CERT_FILE} --key-file ${KEY_FILE} \
    +    /usr/bin/etcdctl --ca-file ${CA_FILE} --cert-file ${CERT_FILE} --key-file ${KEY_FILE} \
           --no-sync -C ${ETCD_SERVERS} \
           mk /coreos.com/network/config "{\"Network\":\"${FLANNEL_NET}\"}" >/dev/null 2>&1
         attempt=$((attempt+1))
    

Once all fixes were in place, all we needed to do to get the ball rolling was to run

MASTER=adrt@192.168.1.192 \
NODES="adrt@192.168.1.192 adrt@192.168.1.193 adrt@192.168.1.194" \
DOCKER_OPTS="--insecure-registry=hub.adroitlogic.com:5000" \
KUBERNETES_PROVIDER=centos \
CERT_GROUP=janaka \
./kube-up.sh

within the cluster directory of the K8s source.

  • Our master node is 192.168.1.192.
  • We have 3 worker nodes, 192.168.1.192, 192.168.1.193 and 192.168.1.194.
  • Our master node itself is a worker node (since we often don't have enough resources on our machines, to run a dedicated master).
  • adrt is the CentOS user on each node (with superuser privileges).
  • janaka is the username on my local machine (where the kube-up.sh script actually gets executed).
  • We have a local Docker hub for holding our IPS images, at hub.adroitlogic.com:5000 (resolved via internal hostname mappings), which we need to inject to the Docker startup script (in order to be able to utilize our own images within the future IPS set-up).

And within minutes, we got a working K8s cluster!
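
As a quick sanity check of the newly provisioned cluster (a minimal sketch, assuming kubectl is installed and pointed at the new master), the three nodes should show up as Ready and the system pods should be running:

kubectl get nodes
kubectl get pods --all-namespaces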


P.S.: All this happened a long time ago, and we utilized the cluster in developing and testing our Integration Platform (IPS), whose latest 17.07 release is now available for download. Check it out; you may happen to like it!

Thursday, April 13, 2017

A Threesome on the Linux Kernel: Intel OpenCL r4.0, VirtualBox 5.0.18 and aufs4 (Docker 1.12.3), all on Kernel 4.7.0-040700-generic

Since I have a small GPU on my HP Envy 15t-ae100 running Ubuntu 16.04, the idea of setting it up as a GPGPU device had been lingering on my to-do list for quite some time. While Intel's website claimed that it could be done with an OpenCL compatibility kernel patch, I was a bit reluctant to follow it for fear of breaking my hard-built system, containing applications and projects configured for my workplace environment over the course of several months.

Finally, one weekend I made a decision to try it all out. After making a backup of all my data, I went ahead and started off with Intel's PDF guide. Happy to say, everything went perfectly and smoothly; there were no detours, and all instructions worked like a charm.

Once I rebooted after installing the patched kernel and fired up BOINC Manager, it happily reported 1 OpenCL GPU (yay!) and 1 OpenCL CPU with 0.00589 compute units, meaning that the real power of the GPU had been exposed by the OpenCL platform :). In addition, I tried the OpenCL capability reporter (ZIP source) and GPU quicksort sample OpenCL programs (both provided by Intel), and they too worked perfectly, reporting a GPU with 24 cores supporting OpenCL 2.0 Full Profile.

All was going fine until I fired up VirtualBox and tried to power up a VM. The VM failed to start, and I was greeted with an error indicating a failure of the vboxdrv module. sudo dpkg-reconfigure virtualbox, suggested by online sources, also failed to rectify the issue, as compilation of the kernel module was failing for the newly installed 4.7.0 kernel. Evidently the installed source of my VirtualBox version (5.0.18) was somehow incompatible with the new kernel.

Problems continued to pile up. The wireless module (bcmwl-kernel-source) that I had installed earlier refused to work under the new kernel. The wireless interface was no longer working, and there was no way to connect to the workplace Wi-Fi network.

As I work frequently with Docker and K8s, I naturally got a feeling that I should check up on them as well. I discovered that Docker was failing to start, spitting out the error [graphdriver] prior storage driver "aufs" failed, rendering the K8s stack pretty much useless (for dev testing I use a full stack of Docker and K8s (both master and minion) on my machine).

Luckily I had not purged my original kernel, so I could get the Wi-Fi and Docker issues resolved by restarting the system with the old kernel (via manual selection using the Advanced options for Ubuntu entry on the GRUB menu). The VirtualBox issue persisted until I performed another sudo dpkg-reconfigure virtualbox, which recompiled the vboxdrv module for the older kernel (still failing on the newer one, but getting successfully installed for the older one).

Once I got myself up and running with the old kernel, I began investigating the issues one by one. The VirtualBox issue was easy, as a patch had already been provided by a generous member of the dev team soon after the 4.7.0 kernel release. I simply had to locate the vboxdrv kernel source location using the log generated during dpkg-reconfigure, apply the patch on top of it, and issue another dpkg-reconfigure to see the module getting compiled successfully for both installed kernels.

The wireless issue was also fairly easy to solve, thanks to this AskUbuntu post. I just had to replace the bcmwl-kernel-source module with broadcom-sta-dkms 6.30.223.271-3, and things started working smoothly again (it even appeared that the wireless "stability" had increased, as some of the connectivity issues I had been consistently experiencing at certain locations seemed to have disappeared).

The Docker issue seemed to be a little bit trickier. While posts like this GitHub issue were placing the blame on missing linux-image-extra packages, I could not find such packages for my 4.7.0 kernel. After a fair amount of wandering around, I learned that separate linux-image-extra packages had not been released for 4.7.0.

The word aufs in the Docker startup error led me to search further on aufs, which carried me over to this guide on unofficially adding it to the kernel as a patch. However, I was not very confident about the approach, as I would have to apply the patches on top of an already-patched source (with OpenCL-related changes). Anyway, I made up my mind in the end (the corresponding commands are sketched after the list):

  • cloning aufs-util with --depth 1 and -b aufs4.1 (closest applicable version for kernel 4.7)
  • cloning aufs4-standalone with --depth 1 and -b aufs4.7
  • patching the 4.7 kernel source (which had already been patched for Intel OpenCL and VirtualBox), following instructions under method 1 of section 3 (Configuration and Compilation) of the guide
  • rebuilding and reinstalling the kernel
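
The cloning and patching steps above roughly boil down to the following (a minimal sketch; the repository URLs are the usual upstream locations of the aufs project, and the patch/copy commands follow the aufs4-standalone README and the linked guide, so verify them against your own checkout):

git clone --depth 1 -b aufs4.1 https://git.code.sf.net/p/aufs/aufs-util aufs-util
git clone --depth 1 -b aufs4.7 https://github.com/sfjro/aufs4-standalone.git aufs4-standalone

# inside the (already OpenCL/VirtualBox-patched) kernel source tree; directory name will vary
cd linux-4.7
patch -p1 < ../aufs4-standalone/aufs4-kbuild.patch
patch -p1 < ../aufs4-standalone/aufs4-base.patch
patch -p1 < ../aufs4-standalone/aufs4-mmap.patch
cp -r ../aufs4-standalone/Documentation ../aufs4-standalone/fs .
cp ../aufs4-standalone/include/uapi/linux/aufs_type.h include/uapi/linux/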

And it worked!

Now I am happily using VirtualBox, Docker and OpenCL with GPU support, all on my own machine.

Freedom of Blogging: Write, Publish and Distribute, all in Raw AsciiDoc!

AsciiDoc (AD) being one of the easiest ways to write rich-text markup as a plaintext source, we see a trend of bloggers moving to AD-based platforms rather than HTML-based ones. This brings the great advantage of ubiquitous blogging, as the hand-typed source can often be published without additional formatting, and eliminates most of the "garbage" HTML introduced by rich-text editors.

Unfortunately, since browsers understand only HTML, viewing the blog ultimately requires conversion of the AD source to HTML at some point. Existing AD blogging platforms do this on the server side, meaning that what you ultimately receive in the browser is not AD but bloated-up HTML.

Given the widespread support for Javascript in almost all modern browsers, we can instead load the AD source directly on the client side, and parse it into HTML using Javascript. In addition to reducing the loading overhead, this eliminates the need for a dedicated backend or a conversion process for transforming the publisher-submitted AD content to HTML.

As none of the existing platforms seemed to support this client-side blog parsing ability, I decided to implement a simple PoC on my own. You can clone it from here and try it out.

In order to publish a blog post using it, you can simply upload the AD source to the hosted content location and update menu.json to include the new post (the entry key and value set to the relative path and title of the post respectively). Thereafter, when the root page (index.html) is loaded in a browser, a list of articles (defined in menu.json) will be displayed, and clicking on an article name would result in the AD source being fetched, parsed to HTML, and displayed in the post viewer. Of course, the tool requires a good touch of styling and 'landscaping' (layout changes) before it can take the shape of an appealing blog.
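
For example, if a new post were uploaded as posts/hello-asciidoc.adoc, the corresponding menu.json might look like the following (the paths and titles here are purely illustrative, following the key/value convention described above):

{
  "posts/hello-asciidoc.adoc": "Hello, AsciiDoc!",
  "posts/client-side-parsing.adoc": "Client-side AD Parsing"
}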

The app is completely driven by JS, including the parsing of fetched articles. AD parsing is achieved through a custom-made JS library (as the official AsciiDoctor.js library is still fairly bulky) that covers only a commonly used subset of the AD spec; e.g. it does not currently support tables or [NOTE] fragments. Suggestions and improvements are always welcome, provided that they maintain the simplicity and streaming (line-by-line parsing) nature of the existing code.

I'm not entirely sure how this type of dynamic rendering would affect SEO; unless the search engine is really intelligent, it will probably not be able to dynamically load and index the articles. Hence if you are looking for SEO or article visibility, this might not be the best option for you. Anyway, I'm going to try it out on my own soon, and I'll keep you posted regarding my progress, at least with the Almighty Google Bot.

Saturday, March 18, 2017

The Dev-Test-Deploy Merry-Go-Round: How AdroitLogic IPS Fits In!

Now that we have heard of what IPS is and what it has in store for you, it's probably time to delve into the details of how it can actually help improve the enterprise integration lifecycle of your organization.

A typical enterprise integration project involves multiple dev-test-deploy cycles. The dev cycles usually include a certain degree of testing as well, possibly on the developer's system and in an actual sandbox inside the deployment environment. QA-level testing is also usually carried out in a sandbox closely resembling the final production environment. Finally, once you are ready to go into production, the current version of the project is 'frozen'; for any new incoming changes, fixes or features, you spin off the next cycle (version) of the project, on basically the same, or a slightly modified, set of environments.

In reality, from the project's point of view, what actually changes when you migrate a project across different environments is its set of externalized project properties and configuration artifacts (external config files, custom libraries and such); the source code and resources bundled inside the project remain unchanged. What does this mean? That you can simply maintain a different set of properties and configurations for each environment, switching them around as you move through the dev-test-deploy cycle!

IPS does just this, along with a lot of additional cool features. It abstracts the notion of environments with an entity of the same name: environment. Following our dev-test-deploy convention, IPS offers DEVELOPMENT, TEST and PRODUCTION environments for its clusters. Migrating a cluster from one environment to the next is as easy as selecting the new environment from a drop-down list and clicking Deploy.

Each project can have a distinct set of property values for each environment. Projects deployed in a cluster inherit their environment from the cluster runtime, and environment-specific properties and configurations are automatically injected to them at startup. While properties are not directly inherited across project versions, you can copy existing properties across versions of the same project.

For example, suppose you have a cluster http-cluster1 and need to spin off a project http-project, currently at version v1 (hence the project version named http-project-v1). You would

  1. create and develop the project (/x:project[id] and /x:project[version] in project.xpml set to http-project and v1 respectively) using UltraStudio,
  2. build and upload the project to IPS,
  3. via the edit project version perspective of IPS dashboard, define appropriate values for externalized properties of http-project-v1 for each environment, and
  4. add the project under cluster http-cluster1, along with any custom configuration artifacts, creating a new cluster version, say http-cluster1-v1.

Now, when you are

  1. in a dev cycle, just set http-cluster1 to DEVELOPMENT and redeploy http-cluster1-v1.
  2. in a test/QA cycle, set http-cluster1 to TEST and redeploy http-cluster1-v1.
  3. in the next dev cycle following the QA,
    • upload an updated revision of http-project-v1 to IPS,
    • set http-cluster1 back to DEVELOPMENT,
    • create a new cluster version http-cluster1-v2 with the updated http-project-v1, and
    • deploy http-cluster1-v2.
  4. traversing the dev-test loop, keep on with the environment switch and project update procedure.
  5. ready for production, set http-cluster1 to PRODUCTION and redeploy http-cluster1-vx, the cluster version containing the latest revision of http-project-v1.

Furthermore, when you wish to roll out a new version v2 of http-project, you would simply

  1. build and upload http-project-v2 via the IPS project uploader,
  2. during the configure project properties stage of the upload process, copy properties from http-project-v1 for all environments using the import from project version option,
  3. create a new cluster (say http-cluster2) and deploy http-project-v2 on it,
  4. continue with the dev-test-deploy cycle for http-project-v2 as before, using cluster http-cluster2, and
  5. once http-project-v2 is solid, dump http-cluster1 and migrate http-cluster2 to PRODUCTION.

As you can see, IPS undertakes many of the tricky aspects of the dev-test-deploy cycle such as artifact management, environment and property governance, and versioning, and promises to deliver a smooth enterprise integration workflow for your IT department.

Wednesday, March 15, 2017

UltraStudio in Action, Episode 1: Write Your Own Basic Authenticator!

Basic authentication is perhaps the easiest, though not the most secure, way to control access to your in-house APIs. It allows users to gain access to the API simply by providing their username-password credentials, without the need for advanced encryption or third-party involvement as in OAuth.
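
Under the hood, the client simply sends an Authorization header containing the word Basic followed by the base64 encoding of username:password. As a quick illustration (a sketch using standard command-line tools; alice/secret is just a made-up credential pair):

# compute the base64 encoding of the credentials
echo -n 'alice:secret' | base64
# prints YWxpY2U6c2VjcmV0, so the client would send:
#   Authorization: Basic YWxpY2U6c2VjcmV0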

We can easily try out basic authentication using UltraESB-X, an easy-to-use ESB (enterprise service bus) with a graphical UI for developing and testing your project on-the-fly. Utilizing the flexibility of the Project-X framework powering UltraESB-X, you can easily write a simple custom processing element to get hands-on experience with a simple basic authentication flow.

For our scenario we shall use the UltraStudio IntelliJ IDEA plugin, which comes bundled with an UltraESB-X instance and provides the graphical UI for developing our little project. UltraStudio can be obtained from here. The license key, which you will receive in a confirmation email, is good for 30 days with the embedded UltraESB-X, i.e. it provides you with a free 30-day trial period for playing around with the product. (Of course you can continue to use UltraStudio even after that, although the embedded ESB will require a new license or an extension; alternatively, you can build an independent project archive and run it in an external UltraESB-X or in IPS.)

Having said that, let's get to work!

Getting UltraStudio up and running

  1. Download IntelliJ IDEA (if you don't have it already). You may want to upgrade to at least 2016.1 if what you have is older. The Community Edition is free for unlimited use, so no issue there!
  2. Download the UltraStudio plugin for your platform: Linux, Windows or Mac.
  3. Follow the installation guide to install the downloaded plugin.

Now you're ready for all your adventures with UltraStudio!

Getting an API up and running

UltraStudio has a set of pre-built sample projects. One of them, REST service mediation, can be used as the base REST API for our basic auth example. The 'API' doesn't do much by itself—it just acts as a proxy, forwarding incoming queries to openweathermap.org—but it gives us a good starting point for trying out basic auth.

  1. First, make sure you have internet connectivity, as UltraStudio pulls some of its resources from the internet.
  2. In IDEA, select File → New → Project.
  3. In the New Project dialog, select Sample Ultra Project on the left sidebar, and click Next.
  4. Ultra Projects (collections of integration flows, or the successor of deployment units in case you are familiar with the original UltraESB) are Maven projects. So let's enter a Maven Group ID and Version for our project, and also a Base Package name sample.auth.basic.
  5. On the next window, you will see all samples available for UltraStudio. Point to REST Service Mediation and click Download.
  6. When the download is complete, UltraStudio will display a confirmation "Successfully Completed the Download. Click Next to Continue"; at this point, click Next.
  7. In the next step, select the location for saving the new project.
  8. Finally, click Finish to create the project.

The project needs no additional configuration and can be run right away!

  1. Click the Reimport All Maven Projects button on the Maven Projects tool window, so that IDEA detects the newly created project as a Maven project.
  2. Click Run → Edit Configurations.
  3. Click the + button on the Run/Debug Configurations dialog and add a new UltraESB Server run configuration.
  4. Give a name to your configuration (e.g. RESTProxy) and click OK.
  5. Click Run → Run RESTProxy to run your project in the embedded ESB!

To ensure things are working fine,

  1. Check the last few log lines in the Run window. They should indicate something like:
    2017-03-12T20:41:34,213 [127.0.1.1-janaka-ENVY] [main] [RestServiceMediation-17.01] [310007I010]  INFO NIOHttpListener HTTP NIO Listener on port : 8280 started
    2017-03-12T20:41:34,217 [127.0.1.1-janaka-ENVY] [main] [system-] [145001I013]  INFO XContainer AdroitLogic UltraStudio UltraESB-X server started successfully in 5 seconds and 976 milliseconds
    2017-03-12T20:41:34,219 [127.0.1.1-janaka-ENVY] [HttpNIOListener-8280] [system-] [310007I012]  INFO NIOHttpListener IO Reactor started for HTTP NIO Listener on port : 8280 ...
  2. Follow the instructions mentioned in the sample to send a request to the API, and verify that a valid response is received.

Plugging in Basic Auth

Now that we have a working API (so to speak) we can think of how to 'secure' it with basic auth:

  • Anonymous users should be blocked with an "unauthorized" message.
  • Users that provide an invalid (malformed) authentication header should be blocked with an appropriate error message.
  • Users that provide an invalid username/password should be blocked with just the same error message.
  • Only successfully authenticated users should be allowed to move forth with the API call.

Accordingly we can easily fulfill our requirement by including a custom Basic Authenticator processing element on the flow. Since credentials are simple username-password pairs, we shall load them from a CSV file at startup.
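
For reference, the credentials.csv we load at startup could be as simple as the following (hypothetical sample entries; one username,password pair per line, with no header row):

alice,alice123
bob,b0bsecret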

  1. To create a custom processing element, expand src/main/java in the Project tool window, right click sample.auth.basic package and select New → Processing Element.
  2. Provide BasicAuthenticator as the class name.
  3. Modify the content of the newly created source file, as follows:
    • Change the extends clause to AbstractSequencedProcessingElement.
      public class BasicAuthenticator extends AbstractSequencedProcessingElement {
    • Enter the value "Basic Authenticator" for the displayName attribute of the @Processor class-level annotation. This is the name that will be displayed in the flow editor.
      @Processor(displayName = "Basic Authenticator", type = ProcessorType.CUSTOM)
    • Add a Map instance variable credentialCache to hold the username-password credentials for authentication.
          private final Map<String, String> credentialCache = new HashMap<>();
    • The process() method is where the magic happens, where we evaluate the user request for valid credentials. For this we,
      1. extract the Authorization header from the HTTP request,
                XMessage msg = xMessageContext.getMessage();
                Optional<String> authHeader = msg.getFirstStringTransportHeader("Authorization");
      2. decode its credential portion,
                    String authString = new String(Base64.getDecoder().decode(authHeader.get().substring(6)));
                    String[] credentials = authString.split(":", 2);
      3. and see if it matches any of the entries in our credentialCache.
                    if (!credentials[1].equals(credentialCache.get(credentials[0]))) {
      4. If the Authorization header is either absent or malformed (i.e. not a base64-encoded string made of a colon-separated username-password pair), we turn the message into a 401 HTTP response with an appropriate message and raise an exception so that the response is sent back via the error path.
                if (!authHeader.isPresent()) {
                    throw deauthorize(msg, "Please provide an Authorization header for basic auth");
                }
        ...
                    if (credentials.length != 2) {
                        throw deauthorize(msg, "Authorization header is not in the correct format");
                    }
                    if (!credentials[1].equals(credentialCache.get(credentials[0]))) {
                        throw deauthorize(msg, "Invalid username/password");
                    }
        ...
                } catch (Exception e) {
                    throw deauthorize(msg, "Failed to process the Authorization header: " + e.getMessage());
                }
      5. If everything tallies, we allow the message to continue by returning ExecutionResult.SUCCESS from the element.
                    msg.removeTransportHeader("Authorization");
                    return ExecutionResult.SUCCESS;
      6. We need to ensure that our credential map credentialCache is populated by the time our API proxy endpoint is up and serving. We can achieve this using the initElement() method of the processing element, which gets called when the flow (and hence the element) is being initialized at server startup. At the moment we shall simply read a credentials.csv file inside the src/test/resources directory and load its content into our credential map, treating it as a two-column list of username-password pairs:
                try (InputStream is = new FileInputStream(getResource("credentials.csv").getPath())) {
                    Scanner scanner = new Scanner(is);
                    while (scanner.hasNextLine()) {
                        String[] values = scanner.nextLine().split(",");
                        credentialCache.put(values[0], values[1]);
                    }
                } catch (IOException e) {
                    throw new IntegrationRuntimeException("Failed to initialize credential cache", e);
                }

The completed class should look like:

package sample.auth.basic;

import org.adroitlogic.x.annotation.config.Processor;
import org.adroitlogic.x.api.ExecutionResult;
import org.adroitlogic.x.api.IntegrationRuntimeException;
import org.adroitlogic.x.api.XMessage;
import org.adroitlogic.x.api.XMessageContext;
import org.adroitlogic.x.api.config.ProcessorType;
import org.adroitlogic.x.base.format.StringFormat;
import org.adroitlogic.x.base.processor.AbstractSequencedProcessingElement;
import org.springframework.context.ApplicationContext;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.*;

@Processor(displayName = "Basic Authenticator", type = ProcessorType.CUSTOM)
public class BasicAuthenticator extends AbstractSequencedProcessingElement {

    private final Map<String, String> credentialCache = new HashMap<>();

    @Override
    protected ExecutionResult sequencedProcess(XMessageContext xMessageContext) {
        XMessage msg = xMessageContext.getMessage();
        Optional<String> authHeader = msg.getFirstStringTransportHeader("Authorization");
        if (!authHeader.isPresent()) {
            throw deauthorize(msg, "Please provide an Authorization header for basic auth");
        }

        try {
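            // strip the leading "Basic " prefix (6 characters) before base64-decoding the credentials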
            String authString = new String(Base64.getDecoder().decode(authHeader.get().substring(6)));
            String[] credentials = authString.split(":", 2);
            if (credentials.length != 2) {
                throw deauthorize(msg, "Authorization header is not in the correct format");
            }
            if (!credentials[1].equals(credentialCache.get(credentials[0]))) {
                throw deauthorize(msg, "Invalid username/password");
            }
            msg.removeTransportHeader("Authorization");
            return ExecutionResult.SUCCESS;

        } catch (IllegalStateException e) {
            throw e;
        } catch (Exception e) {
            throw deauthorize(msg, "Failed to process the Authorization header: " + e.getMessage());
        }
    }

    private IllegalStateException deauthorize(XMessage msg, String reason) {
        msg.setResponseCode(401);
        msg.setPayload(new StringFormat(reason));
        return new IllegalStateException("Authentication failed for message " + msg.getMessageId());
    }

    protected void initElement(ApplicationContext context) {
        try (InputStream is = new FileInputStream(getResource("credentials.csv").getPath())) {
            Scanner scanner = new Scanner(is);
            while (scanner.hasNextLine()) {
                String[] values = scanner.nextLine().split(",");
                credentialCache.put(values[0], values[1]);
            }
        } catch (IOException e) {
            throw new IntegrationRuntimeException("Failed to initialize credential cache", e);
        }
    }
}

Now let's modify the integration flow to include our basic auth element:

  1. With BasicAuthenticator open in the editor, click Build → Recompile BasicAuthenticator.java to compile the class.
  2. Now open the file src/main/conf/simple-rest-service-mediation.xcml (and switch to the Design tab at the bottom). If the file is already open, click the Refresh View button inside the UI to re-render the view.
  3. Once the refresh is done, you will find the new Basic Authenticator element under Processors → Custom category on the left palette. Drag-and-drop it into the work area.
  4. Rewire the flow such that, before hitting the HTTP Egress Connector, the message flows through the Basic Authenticator:
    1. Delete the path going out from NIO HTTP Listener's Processor port.
    2. Connect the above port to the Input port of Basic Authenticator.
    3. Connect the Next port of Basic Authenticator to the Input port of NIO HTTP Sender.
    4. Connect the On Exception port of Basic Authenticator to the Input (bottommost) port of NIO HTTP Listener (so that it now has 2 incoming connection paths).

Now run the previously created RESTProxy configuration to get the ESB running.

Testing the whole thing

To verify that our basic authenticator indeed works (a command-line equivalent of these checks, using curl, is sketched after the list),

  1. Try to access the /service/rest-proxy endpoint in the same way you tried in the original sample (the configuration should still be available in the Ultra Studio Toolbox unless you had already closed IDEA). It would fail with an error response similar to the following, as we are not sending an Authorization header:
    HTTP/1.1 401 Unauthorized
    User-Agent: AdroitLogic (http://adroitlogic.org) - SOA Toolbox/1.5.0
    Host: localhost:8280
    Date: Mon, 13 Mar 2017 01:13:53 GMT
    Server: AdroitLogic UltraStudio UltraESB-X
    Content-Length: 53
    Content-Type: text/plain; charset=ISO-8859-1
    Connection: close
    
    Please provide an Authorization header for basic auth
  2. Now let's try an invalid Authorization header.
    1. Switch to the configuration view on the Toolbox HTTP client (via the Show/Hide config panel button).
    2. Switch to the Custom Headers tab.
    3. Add a custom header with key Authorization and a malformed value base64 encoded values cannot contain spaces.
    4. Now, sending another request, we see another 401 response indicating that our malformed Authorization header could not be decoded:
      HTTP/1.1 401 Unauthorized
      Authorization: base64 encoded values cannot contain spaces
      User-Agent: AdroitLogic (http://adroitlogic.org) - SOA Toolbox/1.5.0
      ...
      
      Failed to process the Authorization header: Illegal base64 character 20
  3. Now, for an invalid username/password:
    1. Remove the previous custom header (using the - button).
    2. Switch to Authentication tab.
    3. Enter a username that is not included in the credentials CSV (or a correct username with a wrong password; both would have the same effect) and try again:
      HTTP/1.1 401 Unauthorized
      Authorization: Basic YWRtaW46cGFzc3dvcmQ=
      User-Agent: AdroitLogic (http://adroitlogic.org) - SOA Toolbox/1.5.0
      ...
      
      Invalid username/password
      See? It's working! But we need to verify one more thing: whether a correct username-password pair can actually get us through :)
  4. On the Authentication tab, enter a correct username-password pair and try again. You should receive a 200 response with the intended payload.
    HTTP/1.1 200 OK
    Access-Control-Allow-Origin: *
    Access-Control-Allow-Credentials: true
    Access-Control-Allow-Methods: GET, POST
    X-Cache-Key: /data/2.5/weather?APPID=________________________________&q=london
    Date: Mon, 13 Mar 2017 01:25:32 GMT
    Content-Type: application/json; charset=utf-8
    Server: AdroitLogic UltraStudio UltraESB-X
    Content-Length: 449
    Connection: close
    
    {"coord":{"lon":-0.13,"lat":51.51},"weather":[{"id":803,"main":"Clouds","description":"broken clouds","icon":"04n"}],"base":"stations","main":{"temp":281.94,"pressure":1022,"humidity":81,"temp_min":280.15,"temp_max":284.15},"visibility":10000,"wind":{"speed":3.1,"deg":300},"clouds":{"all":75},"dt":1489366200,"sys":{"type":1,"id":5091,"message":0.0061,"country":"GB","sunrise":1489385925,"sunset":1489428120},"id":2643743,"name":"London","cod":200}
  5. Finally, let's also see what would happen if we are successfully authenticated but the backend is unavailable:
    1. Disconnect from the internet.
    2. Retry the last request.
    3. You should receive a 500 error, while a corresponding exception (possibly an UnknownHostException) appears in the ESB log:
      HTTP/1.1 500 Internal Server Error
      Date: Mon, 13 Mar 2017 01:27:29 GMT
      Server: AdroitLogic UltraStudio UltraESB-X
      Content-Length: 21
      Content-Type: text/plain; charset=ISO-8859-1
      Connection: close
      
      Internal Server Error
      ESB log:
      2017-03-13T06:57:29,908 [127.0.1.1-janaka-ENVY] [pool-2-thread-8] [system-] [110801E005] ERROR MessageReceiver Error occurred while processing message context : ae96376f-4ba0-0539-0000-000000000005
       java.net.UnknownHostException: api.openweathermap.org
      	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.validateAddress(DefaultConnectingIOReactor.java:245) ~[httpcore-nio-4.4.jar:4.4]
      	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processSessionRequests(DefaultConnectingIOReactor.java:264) ~[httpcore-nio-4.4.jar:4.4]
      	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:141) ~[httpcore-nio-4.4.jar:4.4]
      	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:350) ~[httpcore-nio-4.4.jar:4.4]
      	at org.adroitlogic.x.transport.http.nio.AbstractNIOHttpSender.startIOReactor(AbstractNIOHttpSender.java:195) ~[x-transport-nio-http-17.01.jar:?]
      	at org.adroitlogic.x.transport.http.nio.AbstractNIOHttpSender.access$500(AbstractNIOHttpSender.java:65) ~[x-transport-nio-http-17.01.jar:?]
      	at org.adroitlogic.x.transport.http.nio.AbstractNIOHttpSender$3.run(AbstractNIOHttpSender.java:174) ~[x-transport-nio-http-17.01.jar:?]
      	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_65]
      2017-03-13T06:57:29,915 [127.0.1.1-janaka-ENVY] [pool-2-thread-8] [system-] [310401E006] ERROR XHttpAsyncRequestHandler Error occurred while processing the message java.net.UnknownHostException: api.openweathermap.org
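
For completeness, here is a rough command-line equivalent of the above checks using curl (a sketch; the path /service/rest-proxy and port 8280 are the ones used by the sample, alice/alice123 is assumed to be a pair from your credentials.csv, and you may need to append the same query parameters you used in the original sample request):

# no Authorization header: expect 401 "Please provide an Authorization header for basic auth"
curl -i http://localhost:8280/service/rest-proxy

# malformed Authorization header: expect 401 "Failed to process the Authorization header: ..."
curl -i -H 'Authorization: not a valid basic auth header' http://localhost:8280/service/rest-proxy

# wrong credentials: expect 401 "Invalid username/password"
curl -i -u alice:wrong-password http://localhost:8280/service/rest-proxy

# correct credentials: expect a 200 response with the proxied weather payload
curl -i -u alice:alice123 http://localhost:8280/service/rest-proxy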

While this is only a very crude example of basic auth, the flexibility of the new processing element model of UltraESB-X lets you go to whatever level of sophistication is required for robust operation, while still maintaining a clean and easy-to-maintain integration flow.