Sunday, April 22, 2018

Deploying your stuff with Google Cloud Deployment Manager: via NodeJS

This may not be the correct way; heck, this may be the crappiest way. I'm putting this up because I could not find a single decent sample on how to do it with JS.

The approach in this post uses NodeJS (server-side), but it is possible to do the same on the client side by loading the Google API client and subsequently the deploymentmanager v2 module; I'll write about it as well, if/when I get a chance.

First you set up authentication so your googleapis client can obtain a token automatically.

Assuming that you have added the googleapis:28.0.1 NPM module to your dependencies list, and downloaded your service account key into the current directory (where the deploymentmanager-invoking code is residing):

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

I used a service account, so YMMV.
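
If you'd rather not keep a key file beside your code, Application Default Credentials are an alternative. Here's a minimal sketch, assuming ADC is already configured (e.g. via the GOOGLE_APPLICATION_CREDENTIALS environment variable) and that your googleapis version exposes google.auth.getClient() (newer releases do; double-check on 28.0.1):

// alternative: Application Default Credentials (sketch; assumes ADC is set up)
google.auth.getClient({
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
})
    .then(authClient => google.options({auth: authClient}))
    .catch(e => console.error("ADC lookup failed", e));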

If you like, you can cache the token at dev time by adding some more gimmicks: I used axios-debug-log to intercept the auth response and persist the token to a local file, from which I read the token during subsequent runs (if the token expires, the JWT client will automatically refresh it, and the refreshed token gets persisted the same way):

process.env.log = "axios";
const tokenFile = "./token.json";
require("axios-debug-log")({
    // disable extra logging
    request: function (debug, config) {},
    error: function (debug, error) {},
    response: function (debug, response) {
        // grab and save access token for reuse
        if (response.data.access_token) {
            console.log("Updating token");
            require("fs").writeFile(tokenFile, JSON.stringify(response.data));
        }
    },
});

// load saved token; if success, use OAuth2 client with loaded token instead of JWT client
// (avoid re-auth at each run)
try {
    const token = require(tokenFile);
    if (!token.access_token) {
        throw Error("no token found");
    }
    token.refresh_token = token.refresh_token || "refreshtoken";    //mocking
    console.log("Using saved tokens from", tokenFile);
    jwtClient.setCredentials(token);
} catch (e) {
    console.log(e.message);
}

Fair enough. Now to get the current state of the deployment:

const projectId = "your-gcp-project-id";
const deployment = "your-deployment-name";

const deployments = google.deploymentmanager("v2").deployments;

let fingerprint = null;

let currentState = deployments.get({
    project: projectId,
    deployment: deployment
})
    .then(response => {
        fingerprint = response.data.fingerprint;
        console.log("Fingerprint", fingerprint);
        return Promise.resolve(response);
    })
    .then(response => {
        // continue the logic
    });

The "fingerprint logic" is needed because we need to pass a "fingerprint" to every "write" (update (preview/start), stop, cancelPreview etc.) operation in order to guarantee in-order execution and operation synchronization.

That done, we set up an update for our deployment by creating a deployment preview (shell) within the last .then():

    .then(response => {
        console.log("Creating deployment preview", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: true,
            resource: {
                name: deployment,
                fingerprint: fingerprint,
                target: {
                    config: {
                        content: JSON.stringify({
                            resources: [
                                /* your resource definitions here; e.g.

                                {
                                    name: "myGcsBucket",
                                    type: "storage.v1.bucket",
                                    properties: {
                                        storageClass: "STANDARD",
                                        location: "US",
                                        labels: {
                                            "keyOne": "valueOne"
                                        }
                                    }
                                }
                                
                                and so on */
                            ]
                        }, null, 2)
                    }
                }
            }
        })
            .catch(e => err("Failed to preview deployment", e))
    })

// small utility function for one-line throws

const err = (msg, e) => {
    console.log(`${msg}: ${e}`);
    throw e;
};

Notice that we passed fingerprint as part of the payload. Without it, Google would complain that it expected one.

But now, we again need to call deployments.get() because the fingerprint would have been updated! (Why the heck doesn't Google return the fingerprint in the response itself?!)

Maybe it's easier to just wrap the modification calls inside a utility code snippet:

const filter = {
    project: projectId,
    deployment: deployment
};

const ensureFingerprint = promise =>
    promise
        .then(response => deployments.get(filter))
        .then(response => {
            fingerprint = response.data.fingerprint;
            console.log("Fingerprint", fingerprint);
            return Promise.resolve(response);
        });

// ...

let preview = ensureFingerprint(Promise.resolve(null))   // only obtain the fingerprint
    .then(response => {
        console.log("Creating deployment preview", deployment);
        return ensureFingerprint(deployments.update({
            // same payload from previous code block
        }))
            .catch(e => err("Failed to preview deployment", e))
    })

True, it's nasty to have a global fingerprint variable; you can pick your own way around it.
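
For instance, here's a minimal sketch that keeps the fingerprint inside a closure instead of a module-level variable (the helper names are mine, not part of any API):

// sketch: encapsulate the fingerprint instead of sharing a global
const makeDeploymentState = (deployments, filter) => {
    let fingerprint = null;    // private to this closure

    const refresh = () =>
        deployments.get(filter).then(response => {
            fingerprint = response.data.fingerprint;
            return response;
        });

    return {
        refresh,
        // run a write with the latest fingerprint, then refresh it again
        write: buildParams =>
            refresh()
                .then(() => deployments.update(buildParams(fingerprint)))
                .then(refresh)
    };
};

Here buildParams(fingerprint) would return the same update payload shown earlier, with the fingerprint injected.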

Meanwhile, if the initial deployments.get() fails because no deployment exists under the given name, we can create one (along with a preview) right away:

    .catch(e => {
        // fail unless the error is a 'not found' error
        if (e.code === 404) {
            console.log("Deployment", deployment, "not found, creating");
            return ensureFingerprint(deployments.insert({
                // same payload as the update() call above, minus the fingerprint
                project: projectId,
                deployment: deployment,
                preview: true,
                resource: {
                    name: deployment,
                    target: {
                        config: {
                            content: JSON.stringify({
                                resources: [
                                    // your resource definitions here
                                ]
                            }, null, 2)
                        }
                    }
                }
            }))
                .catch(e => err("Deployment creation failed", e));
        } else {
            err("Unknown failure in previewing deployment", e);
        }
    });

Now let's keep on "monitoring" the preview until it reaches a stable state (DONE, CANCELLED etc.):

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

let timer = {
    handle: null
};
preview.then(response => {
    console.log("Starting preview monitor", deployment);
    startTimer(() => {
        return deployments.get(filter)
            .catch(e => {
                //TODO detect and ignore temporary failures
                err("Unexpected error in monitoring preview", e);
            })
            .then(response => {
                let op = response.data.operation;
                let status = op.status;
                console.log(status, "at", op.progress, "%");

            })
    }, timer, 5000);
});

And check if we reached a terminal (completion) state:

const SUCCESS_STATES = ["SUCCESS", "DONE"];
const FAILURE_STATES = ["FAILURE", "CANCELLED"];
const COMPLETE_STATES = SUCCESS_STATES.concat(FAILURE_STATES);

// ...

            .then(response => {
                // ...

                if (COMPLETE_STATES.includes(status)) {
                    console.log("Preview completed with status", status);
                    if (SUCCESS_STATES.includes(status)) {
                        if (op.error) {
                            console.error("Errors:", op.error);
                        } else {
                            // preview succeeded - the actual deployment kicks in here (see the next snippet)
                        }
                    } else if (FAILURE_STATES.includes(status)) {
                        console.log("Preview failed, skipping deployment");
                    }
                    return false;
                }
                return true;

If we reach a success state, we can commence the actual deployment:

                        // ...
                        } else {
                            deploy();
                        }

// ...

const deploy = () => {
    let deployer = () => {
        console.log("Starting deployment", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: false,
            resource: {
                name: deployment,
                fingerprint: fingerprint
            }
        })
            .catch(e => err("Deployment startup failed", e))
    };

And start monitoring again, until we reach a completion state:

    // ...

    deployer().then(response => {
        console.log("Starting deployment monitor", deployment);
        startTimer(() => {
            return deployments.get(filter)
                .catch(e => {
                    //TODO detect and ignore temporary failures
                    err("Unexpected error in monitoring deployment", e);
                })
                .then(response => {
                    let op = response.data.operation;
                    let status = op.status;
                    console.log(status, "at", op.progress, "%");

                    if (COMPLETE_STATES.includes(status)) {
                        console.log("Deployment completed with status", status);
                        if (op.error) {
                            console.error("Errors:", op.error);
                        }
                        return false;  // stop
                    }
                    return true;  // continue
                })
        }, timer, 5000);
    });
};

Recap:

const SUCCESS_STATES = ["SUCCESS", "DONE"];
const FAILURE_STATES = ["FAILURE", "CANCELLED"];
const COMPLETE_STATES = SUCCESS_STATES.concat(FAILURE_STATES);

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

const projectId = "your-gcp-project-id";
const deployment = "your-deployment-name";

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

// small utility function for one-line throws

const err = (msg, e) => {
    console.log(`${msg}: ${e}`);
    throw e;
};

let timer = {
    handle: null
};

const deployments = google.deploymentmanager("v2").deployments;

const filter = {
    project: projectId,
    deployment: deployment
};

let fingerprint = null;
const ensureFingerprint = promise =>
    promise
        .then(response => deployments.get(filter))
        .then(response => {
            fingerprint = response.data.fingerprint;
            console.log("Fingerprint", fingerprint);
            return Promise.resolve(response);
        });

let preview = ensureFingerprint(Promise.resolve(null))   // only obtain the fingerprint
    .then(response => {
        console.log("Creating deployment preview", deployment);
        return ensureFingerprint(deployments.update({
            project: projectId,
            deployment: deployment,
            preview: true,
            resource: {
                name: deployment,
                fingerprint: fingerprint,
                target: {
                    config: {
                        content: JSON.stringify({
                            resources: [
                                // your resource definitions here
                            ]
                        }, null, 2)
                    }
                }
            }
        }))
            .catch(e => err("Failed to preview deployment", e))
    })
    .catch(e => {
        // fail unless the error is a 'not found' error
        if (e.code === 404) {
            console.log("Deployment", deployment, "not found, creating");
            return ensureFingerprint(deployments.insert({
                // same payload as the update() call above, minus the fingerprint
                project: projectId,
                deployment: deployment,
                preview: true,
                resource: {
                    name: deployment,
                    target: {
                        config: {
                            content: JSON.stringify({
                                resources: [
                                    // your resource definitions here
                                ]
                            }, null, 2)
                        }
                    }
                }
            }))
                .catch(e => err("Deployment creation failed", e));
        } else {
            err("Unknown failure in previewing deployment", e);
        }
    });

preview.then(response => {
    console.log("Starting preview monitor", deployment);
    startTimer(() => {
        return deployments.get(filter)
            .catch(e => {
                //TODO detect and ignore temporary failures
                err("Unexpected error in monitoring preview", e);
            })
            .then(response => {
                let op = response.data.operation;
                let status = op.status;
                console.log(status, "at", op.progress, "%");

                if (COMPLETE_STATES.includes(status)) {
                    console.log("Preview completed with status", status);
                    if (SUCCESS_STATES.includes(status)) {
                        if (op.error) {
                            console.error("Errors:", op.error);
                        } else {
                            deploy();
                        }
                    } else if (FAILURE_STATES.includes(status)) {
                        console.log("Preview failed, skipping deployment");
                    }
                    return false;  // stop
                }
                return true;  // continue
            })
    }, timer, 5000);
});

const deploy = () => {
    let deployer = () => {
        console.log("Starting deployment", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: false,
            resource: {
                name: deployment,
                fingerprint: fingerprint
            }
        })
            .catch(e => err("Deployment startup failed", e))
    };

    deployer().then(response => {
        console.log("Starting deployment monitor", deployment);
        startTimer(() => {
            return deployments.get(filter)
                .catch(e => {
                    //TODO detect and ignore temporary failures
                    err("Unexpected error in monitoring deployment", e);
                })
                .then(response => {
                    let op = response.data.operation;
                    let status = op.status;
                    console.log(status, "at", op.progress, "%");

                    if (COMPLETE_STATES.includes(status)) {
                        console.log("Deployment completed with status", status);
                        if (op.error) {
                            console.error("Errors:", op.error);
                        }
                        return false;  // stop
                    }
                    return true;  // continue
                })
        }, timer, 5000);
    });
};

That should be enough to get you going.

Good luck!

Serverless is the new Build Server: Google CloudBuild (Container Builder) via NodeJS

Google's CloudBuild (a.k.a. "Container Builder") is an on-demand, container-based build service offered under the Google Cloud Platform (GCP). For you and me, it is a nice alternative to maintaining and paying for our own build server, and a clever addition to anyone's CI stack.

CloudBuild allows you to start from a source (e.g. a Google Cloud Source repo, a GCS bucket, or perhaps even nothing - a blank, "scratch" directory), incrementally apply several Docker container runs upon it, and publish the final result to a desired location: like a Docker repository or a GCS bucket.

With its wide variety of custom builders, CloudBuild can do almost anything - that is, as far as I have seen, anything that can be achieved by a Docker container and a volume mount can be fulfilled in CloudBuild as well. To our great satisfaction, this includes fetching sources from GitHub/BitBucket repos (in addition to the native source location options), running custom commands like zip, and much more!

On top of all this (how nice of GCP!), CloudBuild gives you 2 whole hours (120 minutes) of build time per day, for free - compared to AWS's counterpart, CodeBuild, which offers just 1 hour and 40 minutes per month!

So, now let's have a look at how we can run a CloudBuild via JS (server-side NodeJS):

First things first: adding googleapis:28.0.1 to our dependency list:

{
  "dependencies": {
    "googleapis": "28.0.1"
  }
}

Don't forget the npm install!

In our logic flow, first we need to get ourselves authenticated; with the google-auth-library module that comes with googleapis, this is quite straightforward because the client can be fed with a JWT auth client right from the beginning, which will handle all the auth stuff behind the scenes:

const projectId = "my-gcp-project-id";

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

Note that, for the above code to work verbatim, you need to place a service account key file in the current directory (usually obtained by creating a new service account via the Google Cloud console, in case you don't already have one).

Now we can simply retrieve the v1 version of the cloudbuild client from google, and start our magic:

const builds = google.cloudbuild("v1").projects.builds;

First we submit a build "spec" to the CloudBuild service. Below is an example for a typical NodeJS module on GitHub:

builds.create({
    projectId: projectId,
    resource: {
        steps: [
            {
                name: "gcr.io/cloud-builders/git",
                args: ["clone", "https://github.com/slappforge/slappforge-sdk", "."]
            },
            {
                name: "gcr.io/cloud-builders/npm",
                args: ["install"]
            },
            {
                name: "kramos/alpine-zip",
                args: [
                    "-q",
                    "-x", "package.json", ".git/", ".git/**", "README.md",
                    "-r",
                    "slappforge-sdk.zip",
                    "."
                ]
            },
            {
                name: "gcr.io/cloud-builders/gsutil",
                args: [
                    "cp",
                    "slappforge-sdk.zip",
                    "gs://sdk-archives/slappforge-sdk/$BUILD_ID/slappforge-sdk.zip"
                ]
            }
        ]
    }
})
    .catch(e => {
        throw Error("Failed to start build: " + e);
    })

Basically we retrieve the source from GitHub, fetch the dependencies via an npm install, bundle the whole thing using a zip command container (took me a while to figure that out, which is why I'm posting this!) and upload the resulting zip to a GCS bucket.

We can tidy this up a bit (and perhaps make the template reusable for subsequent builds) by extracting the parameters into a substitutions section:

const repoUrl = "https://github.com/slappforge/slappforge-sdk";
const projectName = "slappforge-sdk";
const bucket = "sdk-archives";

builds.create({
    projectId: projectId,
    resource: {
        steps: [
            {
                name: "gcr.io/cloud-builders/git",
                args: ["clone", "$_REPO_URL", "."]
            },
            {
                name: "gcr.io/cloud-builders/npm",
                args: ["install"]
            },
            {
                name: "kramos/alpine-zip",
                args: [
                    "-q",
                    "-x", "package.json", ".git/", ".git/**", "README.md",
                    "-r",
                    "$_PROJECT_NAME.zip",
                    "."
                ]
            },
            {
                name: "gcr.io/cloud-builders/gsutil",
                args: [
                    "cp",
                    "$_PROJECT_NAME.zip",
                    "gs://$_BUCKET_NAME/$_PROJECT_NAME/$BUILD_ID/$_PROJECT_NAME.zip"
                ]
            }
        ],
        substitutions: {
            _REPO_URL: repoUrl,
            _PROJECT_NAME: projectName,
            _BUCKET_NAME: bucket
        }
    }
})
    .catch(e => {
        throw Error("Failed to start build: " + e);
    })

Once the build is started, we can monitor it like so (with a few, somewhat neat wrappers to properly manage the timer logic):

    .then(response => {
        let timer = {
            handle: null
        };

        startTimer(() => {
            return builds.get({
                projectId: projectId,
                id: response.data.metadata.build.id
            })
                .catch(e => {
                    throw e;
                })
                .then(response => {
                    const COMPLETE_STATES = ["SUCCESS", "DONE", "FAILURE", "CANCELLED"];
                    if (COMPLETE_STATES.includes(response.data.status)) {
                        return false;
                    }
                    return true;
                })
        }, timer, 5000);
    });

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

Once the build reaches a steady state, you are done!

If you want fine-grained details, just dig into response.data within the timer callback blocks.
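
For example, here's a rough sketch of per-step logging inside the monitor; the field names (steps, status, logUrl) follow the v1 Build resource, but treat them as assumptions and inspect response.data yourself:

                .then(response => {
                    let build = response.data;
                    console.log("Build", build.id, "is", build.status);
                    (build.steps || []).forEach((step, i) =>
                        console.log(`  step ${i} (${step.name}):`, step.status || "QUEUED"));
                    console.log("Full log (assumed field):", build.logUrl);

                    const COMPLETE_STATES = ["SUCCESS", "DONE", "FAILURE", "CANCELLED"];
                    return !COMPLETE_STATES.includes(build.status);
                })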

Happy CloudBuilding!

Friday, April 20, 2018

Serverless: a no-brainer!

A few years ago, containers swept through the dev and devops lands like a category-6 hurricane.

Docker. Rkt. others.

Docker Swarm.

K8s.

OpenShift.

Right now we are literally at the epicenter, but when we glimpse at the horizon we see another one coming!

Serverless.

The funny thing is, "serverless" itself is a misnomer.

Of course there are servers. There are always servers. How can programs execute themselves in thin air, without the support of the underlying hardware and utility modules? So, there are servers.

Just not where you would expect them to be.

Traversing the timeline of computing, we see the turbulent track record of servers: birth in secret dungeons of vacuum tubes and city-scale power supplies; multi-ton boxes; networks; clusters; cloud datacenters and server farms (agriculture just lost its royalty!); containers.

Over time, we see servers losing their significance. Gradually, but steadily.

And now, suddenly, puff! They are gone.

Invisible, to be precise.

With serverless, you no longer care about the server. It may be a physical machine, a cloud VM, a K8s pod, an ECS container... heck, even an IoT rig.

Nobody cares, as long as the job gets done.

In this sense, we realize that serverless is nothing new; the concept, and even some practical implementations, have been there since as far back as 2006. You yourself may have benefited from serverless (or conceptually serverless) architectures; while one may argue them to be PaaSes, Google App Engine and Google Apps Script (especially) are good examples from my Google-ridden "fungramming" history.

Just like touchscreens, resemblances of serverless had always been around, but never has the marketing hype been this intense - and it is obviously growing; we'll surely see more of it as time flies by.

AWS had an early entry to the arena and currently owns a huge market share, bigger than all the others combined; Azure is behind, but catching up fast; and Google still seems to be more focused on Kubernetes and related containerization stuff although they too are on the track with Cloud Functions and Firebase.

Streaming and event-driven architectures are playing their part in bringing value to serverless. We should also not forget the cloud hype that made people go every-friggin-thing-as-a-service and later left them wondering how they could pay only for what they really use, only while they use it.

All ramblings aside, serverless is growing in popularity. Platforms are evolving to support more event sources, better integration with other services, and richer monitoring and statistics. Frameworks like Serverless are striving to provide a unified and generic serverless development experience, while IDEs like Sigma are doing their part in helping newbies (and sometimes even professionals) get going with serverless with minimum hassle and maximum speed.

Being new and shiny does not necessarily mean that serverless is the silver bullet for all your dev issues; in fact, right now it fits into only a few enterprise use cases (primarily due to the lack of strong guarantees, which are quite commonplace in the bureaucratic enterprise atmosphere). Nevertheless, providers are already working on this, and we can expect some disruptive - if not revolutionary - changes in the not-too-distant future. However it is always best to re-iterate your requirements before officially stepping into the serverless world, because serverless demands quite a shift in your application architecture, devops, as well as the very core of your developer mindset.

And, of course, the best way to pick the cake is to taste it yourself.

Sigma the Serverless IDE: resources, triggers, and heck, operations

With serverless, you stopped caring about the server.

With Sigma, you stopped (or will stop, if not already) caring about the platform.

Now all you care about is your code - the bliss of every programmer.

Or is it?

I hold her (the code) in my arms.

If you have done time with serverless frameworks, you would already know how they take away your platform-phobia, abstracting out the platform-specific bits of your serverless app.

And if you have already tried out Sigma, you would have noticed how it takes things further, relieving you of the burden of the configuration and deployment aspects as well.

Sigma for a healthier dev life!

Leaving behind just the code.

Just the beautiful, raw code.

So, what's the catch?

Okay. Now for the untold, unspoken, not-so-popular part.

You see, my friend, every good thing comes at a price.

Lucky for you, with Sigma, the price is very affordable.

Just a matter of sticking to a few ground rules while you develop your app. That's all.

Resources, resources.

All of Sigma's voodoo depends on one key thing: resources.

Resources, resources!

The concept is quite simple: every piece of your serverless app - may it be a DynamoDB table, S3 bucket or SNS topic - is a resource from Sigma's point of view.

If you remember the Sigma UI, the Resources pane on the left contains different resource types that you can have in your serverless app. (True, it's pretty short; but we're working on it :))

Resources pane in Sigma UI

Behind the scenes

When you drag a resource from this pane into your code, Sigma secretly creates a resource (which it would later deploy into your serverless provider) to track the configuration of the actual service entity (say, the S3 bucket that should exist in AWS by the time your function is running) and all its usages within your code. The tracking is fully automated; frankly, you don't even need to know it's happening.

Sigma tracks resources in your app, and deploys them into the underlying platform!

"New" or "existing"?

On almost all of Sigma's resource configuration pop-ups, you may have noticed two options: "new" vs "existing". "New" resources are the ones that would be (or have already been) created as a result of your project, whereas "existing" ones are those which have been created outside of your project.

Now that's a tad bit strange because we would usually use "existing" to denote things that "exist", regardless of their origin - even if they came from Mars.

Better brace yourself, because this also gives rise to a weirder notion: once you have deployed your project, the created resources (which now "exist" in your account) are still treated by Sigma as "new" resources!

And, as if that wasn't enough, this makes the resources lists in Sigma behave in totally unexpected ways; after you define a "new" resource, whenever you want to reuse that resource somewhere else, you would have to look for it under the "existing" tab of the resource pop-up; but it will be marked with a " (new)" prefix because, although it is already defined, it remains "new" from Sigma's point of view.

Now, how sick is that?!

Bang head here.

Perhaps we should have called them "Sigma" resources; or perhaps even better, "project" resources; while we scratch our heads, feel free to chip in and help us with a better name!

Rule o' thumb

Until this awkwardness is settled, the easiest way to get through this mess is to stick to this rule of thumb:


If you added a resource to your current Sigma project, Sigma would treat it as a "new" resource till the end of eternity.


Bottom line: no worries!

Being able to use existing resources is sometimes cool, but it means that your project would be much less portable. Sigma will always assume that the resources referenced by your project are already in existence, regardless of which AWS account you attempt to deploy it into. At least until (if) we (ever) come up with a different resource management mechanism.


If you want portability, always stick to new resources. That way, even if a complete stranger gets hold of your project and deploys it in his own, alien, unheard-of AWS account, the project would still work.

If you are integrating with an already existing set of resources (e.g. the set of S3 buckets in your already-running dev/test/prod environment), using existing resources is the obvious (and the most convenient) choice.


Anyways, back to our discussion:

Where were we?

Ah, yes. Resources.

The secret life of resources

In a serverless app, you basically use resources for two things:

  • for triggering the app (as an event source, a.k.a. trigger)
  • for performing work inside the app, such as invoking external services

triggers and operations

Resources? Triggers?? Operations???

Sigma also associates its resources with your serverless app in a similar fashion:

In Sigma, a function can have several triggers (as long as the code itself is written to handle the different trigger event types!), and can contain several operations (obviously).

Yet, they're different.

It is noteworthy that a resource itself is not a trigger or an operation; triggers and operations are associated with resources (they kind of "bridge" functions and resources) but a resource has its own independent life. As a result, a resource can power many triggers (to be precise, zero or more) and get involved in many operations, across many (again, zero or more) functions.

A good example is S3. If you want to write an image resizer function that would pick and process images dropped into an S3 bucket, you would configure an S3 trigger to invoke the function upon the file drop, and an S3 GetObject operation to retrieve and process the file; however, both will point to the same S3 resource, namely the bucket where images are being dropped into and fetched from.

Launch time!

At deployment, Sigma will take care of putting the pieces together - trigger configs, runtime permissions and whatnot - based on which function is associated with which resources, and in which ways (trigger-mode vs operation-mode). You can simply drag, drop and configure your buckets, queues and stuff, write your code, and totally forget about the rest!

That's the beauty of Sigma.

When a resource is "abandoned" (meaning that it is not used in any trigger or operation), it shows up in the "unused resources" list (remember the dustbin button on the toolbar?) and can be removed from the project; remember that if you do this, provided that the resource is a "new" one (rule of thumb: one created in Sigma), it will be automatically removed from your serverless provider account (for example, AWS) during your next deployment!

So there!

If Sigma's resource model (the whole purpose of this article) looks like a total mess-up to you, feel free to raise your voice on StackOverflow - or better still, our GitHub space, FB page or Twitter feed; we would appreciate it very much!

Of course, Sigma has nothing to hide; if you check your AWS account after a few Sigma deployments, you would realize the things we have been doing under the hood.

All of it, to make your serverless journey as smooth as possible.

And easy.

And fun. :)

Welcome to the world of Sigma!

Thursday, April 19, 2018

Sigma QuickBuild: Towards a Faster Serverless IDE

TL;DR

The QuickBuild/QuickDeploy feature described here is pretty much obsoleted by the test framework (ingeniously hacked together by @CWidanage), which gives you a much more streamlined dev-test experience with much better response time!


In case you hadn't noticed, we have recently been chanting about a new Serverless IDE, the mighty SLAppForge Sigma.

With Sigma, developing a serverless app becomes as easy as drag-drop, code, and one-click-Deploy; no getting lost among overcomplicated dashboards, no eternal struggles with service entities and their permissions, no sailing through oceans of docs and tutorials - above all that, nothing to install (just a web browser - which you already have!).

So, how does Sigma do it all?

In case you already tried Sigma and dug a bit deeper than just deploying an app, you may have noticed that it uses AWS CodeBuild under the hood for the build phase. While CodeBuild gives us a fairly simple and convenient way of configuring and running builds, it has its own set of quirks:

  • CodeBuild takes a significant time to complete (sometimes close to a minute). This may not be a problem if you just deploy a few sample apps, but it can severely impair your productivity - especially when you begin developing your own solution, and need to reflect your code updates every time you make a change.
  • The AWS Free Tier only includes 100 minutes of CodeBuild time per month. While this sounds like a generous amount, it can expire much faster than you think - especially when developing your own app, in your usual trial-and-error cycles ;) True, CodeBuild doesn't cost much either ($0.005 per minute of build.general1.small), but why not go free while you can? :)

Options, people?

Lambda, on the other hand, has a rather impressive free quota of 1 million executions and 3.2 million seconds of execution time per month. Moreover, traffic between S3 and Lambda is free as far as we are concerned!

Oh, and S3 has a free quota of 20000 reads and 2000 writes per month - which, with some optimizations on the reads, is quite sufficient for what we are about to do.

2 + 2 = ...

So, guess what we are about to do?

Yup, we're going to update our Lambda source artifacts in S3, via Lambda itself, instead of CodeBuild!

Of course, replicating the full CodeBuild functionality via a lambda would need a fair deal of effort, but we can get away with a much simpler subset; read on!

The Big Picture

First, let's see what Sigma does when it builds a project:

  • prepare the infra for the build, such as a role and an S3 bucket, skipping any that already exist
  • create a CodeBuild project (or, if one already exists, update it to match the latest Sigma project spec)
  • invoke the project, which will:
    • download the Sigma project source from your GitHub repo,
    • run an npm install to populate its dependencies,
    • package everything into a zip file, and
    • upload the zip artifact to the S3 bucket created above
  • monitor the project progress, and retrieve the URL of the uploaded S3 file when done.

And usually every build has to be followed by a deployment, to update the project's lambdas to point to the newly generated source archive; and that means a whole load of additional steps!

  • create a CloudFormation stack (if one does not exist)
  • create a changeset that contains the latest updates to be published
  • execute the changeset, which will, at the least, have to:
    • update each of the lambdas in the project to point to the new source zip file generated by the build, and
    • in some cases, update the triggers associated with the modified lambdas as well
  • monitor the stack progress until it gets through with the update.

All in all, well over 60-90 seconds of your precious time - all to accommodate perhaps just one line (or how about one word, or one letter?) of change!

Can we do better?

At first glance, we see quite a few redundancies and possible improvements:

  • Cloning the whole project source from scratch is overkill, especially when only a few lines/files have changed.
  • Every build will download and populate the NPM dependencies from scratch, consuming bandwidth, CPU cycles and build time.
  • The whole zip file is now being prepared from scratch after each build.
  • Since we're still in dev, running a costly CF update for every single code change doesn't make much sense.

But since CodeBuild invocations are stateless and CloudFormation's resource update logic is mostly out of our hands, we don't have the freedom to meddle with many of the above, apart from simple improvements like enabling dependency caching.

Trimming down the fat

However, if we have a lambda, we have full control over how we can simplify the build!

If we think about 80% - or maybe even 90% - of the cases for running a build, we see that they merely involve changes to application logic (code); you don't add new dependencies, move your files around or change your repo URL all the time, but you sure as heck would go through an awful lot of code edits until your code starts behaving as you expect it to!

And what does this mean for our build?

80% - or even 90% - of the time, we can get away by updating just the modified files in the lambda source zip, and updating the lambda functions themselves to point to the updated file!

Behold, here comes QuickDeploy!

And that's exactly what we do, with the QuickBuild/QuickDeploy feature!

Lambda to the rescue!

QuickBuild uses a lambda (deployed in your own account, to eliminate the need for cross-account resource access) to:

  • fetch the latest CodeBuild zip artifact from S3,
  • patch the zip file to accommodate the latest code-level changes, and
  • upload the updated file back to S3, overriding the original zip artifact

Once this is done, we can run a QuickDeploy which simply sends an UpdateFunctionCode Lambda API call to each of the affected lambda functions in your project, so that they can scoop up the latest and greatest of your serverless code!

And the whole thing does not take more than 15 seconds (give or take the network delays): a raw 4x improvement in your serverless dev workflow!

A sneak peek

First of all, we need a lambda that can modify an S3-hosted zip file based on a given set of input files. While it's easy to make with NodeJS, it's even easier with Python, and requires zero external dependencies as well:

Here we go... Pythonic!

import boto3

from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED

s3_client = boto3.client('s3')

def handler(event, context):
  src = event["src"]
  if src.find("s3://") > -1:
    src = src[5:]
  
  bucket, key = src.split("/", 1)
  src_name = "/tmp/" + key[(key.rfind("/") + 1):]
  dst_name = src_name + "_modified"
  
  s3_client.download_file(bucket, key, src_name)
  zin = ZipFile(src_name, 'r')
  
  diff = event["changes"]
  zout = ZipFile(dst_name, 'w', ZIP_DEFLATED)
  
  added = 0
  modified = 0
  
  # files that already exist in the archive
  for info in zin.infolist():
    name = info.filename
    if (name in diff):
      modified += 1
      zout.writestr(info, diff.pop(name))
    else:
      zout.writestr(info, zin.read(info))
  
  # files in the diff, that are not on the archive
  # (i.e. newly added files)
  for name in diff:
    info = ZipInfo(name)
    info.external_attr = 0755 << 16L
    added += 1
    zout.writestr(info, diff[name])
  
  zout.close()
  zin.close()
  
  s3_client.upload_file(dst_name, bucket, key)
  return {
    'added': added,
    'modified': modified
  }

We can directly invoke the lambda using the Invoke API, hence we don't need to define a trigger for the function; just a role with S3 full access permissions would do. (We use full access here because we would be reading from/writing to different buckets at different times.)

CloudFormation, you beauty.

From what I see, the coolest thing about this contraption is that you can stuff it all into a single CloudFormation template (remember the lambda command shell?) that can be deployed (and undeployed) in one go:

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  zipedit:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: zipedit
      Handler: index.handler
      Runtime: python2.7
      Code:
        ZipFile: >
          import boto3
          
          from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED
          
          s3_client = boto3.client('s3')
          
          def handler(event, context):
            src = event["src"]
            if src.find("s3://") > -1:
              src = src[5:]
            
            bucket, key = src.split("/", 1)
            src_name = "/tmp/" + key[(key.rfind("/") + 1):]
            dst_name = src_name + "_modified"
            
            s3_client.download_file(bucket, key, src_name)
            zin = ZipFile(src_name, 'r')
            
            diff = event["changes"]
            zout = ZipFile(dst_name, 'w', ZIP_DEFLATED)
            
            added = 0
            modified = 0
            
            # files that already exist in the archive
            for info in zin.infolist():
              name = info.filename
              if (name in diff):
                modified += 1
                zout.writestr(info, diff.pop(name))
              else:
                zout.writestr(info, zin.read(info))
            
            # files in the diff, that are not on the archive
            # (i.e. newly added files)
            for name in diff:
              info = ZipInfo(name)
              info.external_attr = 0755 << 16L
              added += 1
              zout.writestr(info, diff[name])
            
            zout.close()
            zin.close()
            
            s3_client.upload_file(dst_name, bucket, key)
            return {
                'added': added,
                'modified': modified
            }
      Timeout: 60
      MemorySize: 256
      Role:
        Fn::GetAtt:
        - role
        - Arn
  role:
    Type: AWS::IAM::Role
    Properties:
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      - arn:aws:iam::aws:policy/AmazonS3FullAccess
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
        - Action: sts:AssumeRole
          Effect: Allow
          Principal:
            Service: lambda.amazonaws.com

Moment of truth

Once the stack is ready, we can start submitting our QuickBuild requests to the lambda!

// assuming auth stuff is already done
let lambda = new AWS.Lambda({region: "us-east-1"});

// ...

lambda.invoke({
  FunctionName: "zipedit",
  Payload: JSON.stringify({
    src: "s3://bucket/path/to/archive.zip",
    changes: {
      "path/to/file1/inside/archive": "new content of file1",
      "path/to/file2/inside/archive": "new content of file2",
      // ...
    }
  })
}, (err, data) => {
  if (err) {
    // the Invoke call itself failed
    return console.error("QuickBuild invocation failed", err);
  }
  let result = JSON.parse(data.Payload);
  let totalChanges = result.added + result.modified;
  if (totalChanges === expected_no_of_files_from_changes_list) {
    // all izz well!
  } else {
    // too bad, we missed a spot :(
  }
});

Once QuickBuild has completed updating the artifact, it's simply a matter of calling UpdateFunctionCode on the affected lambdas, with the S3 URL of the artifact:

lambda.updateFunctionCode({
  FunctionName: "original_function_name",
  S3Bucket: "bucket",
  S3Key: "path/to/archive.zip"
})
.promise()
.then(() => { /* done! */ })
.catch(err => { /* something went wrong :( */ });

(In our case the S3 URL remains unchanged (because our lambda simply overwrites the original file), but it still works because the Lambda service makes a copy of the code artifact when updating the target lambda.)

To speed up the QuickDeploy for multiple lambdas, we can even parallelize the UpdateFunctionCode calls:

Promise.all(
  lambdaNames.map(name =>
    lambda.updateFunctionCode({ /* params */ })
      .promise()
      .then(() => { /* done! */ })
  )
)
  .then(() => { /* all good! */ })
  .catch(err => { /* failures; handle them! */ });

And that's how we gained an initial 4x improvement in our lambda deployment cycle, sometimes even faster than the native AWS Lambda console!

Thursday, March 22, 2018

Serverless ZipChamp: Update your zip files in S3, (almost) in-place!

Simple Storage Service, more commonly known as S3, is the second most popular among AWS-offered cloud services. Originally meant to be a key-value store, it eventually transformed into one of the world's most popular file storage services. Its simple (just like the name!), intuitive, filesystem-like organization, combined with handy features like bucket hosting and CDN support via CloudFront, has made S3 an ideal choice for content hosting and delivery among organizations of all sizes.

AWS S3: simple storage

However, when it comes to using S3 as a filesystem, there are some really cool features that I miss; and I'm sure others do, too. A good example is extra support for archive-type entries (e.g. zip files); right now, if you want to change a zip file on S3, you need to download it into a real filesystem (local, EC2, etc.), unpack (unzip) it, make the modification, repack (zip) it and upload it back to S3. Just think how cool it would be if you could do this in-place in S3 itself, just like archives can be modified on-the-fly in OSs like Ubuntu!

Of course, my dream is out-of-scope for S3, as it is simply supposed to provide key-value type storage. But it would be really cool if AWS guys could provide it as a feature, or if I could hack together a way to do it in-cloud (in-AWS) without having to sacrifice my local bandwidth or spin up an EC2 instance every time a modification is needed.

And the good news is: thanks to the recent advent of AWS Lambda, we can now hack together a lambda function to do just the thing for us!

AWS Lambda: serverless compute

If you're a total newbie, AWS Lambda allows you to host a snippet of logic (well, maybe even a fully-fledged application) that can be triggered by an external event (like an API call or a timer trigger). No need to run a virtual machine or even deploy a container to host your code; and the best part is, you only pay for the actual runtime - makes sense, because there is practically nothing running (meaning zero resource consumption) while your lambda is idle.

Ah, and I lied.

The best part is, with the AWS free tier, you get a quota of 1 million executions per month, for free; and 3.2 million seconds of execution time, also for free. At full capacity, this means that your lambda can run one million times each month, every invocation taking up to 3.2 seconds, before AWS starts charging for it; and even that, in tiny increments (just $0.00001667 per GB-second).

Wow, I can't think of a better place to host my own S3 zip file editor; can you? I won't be using it much (maybe a couple times a month), and whenever I need it, it will just come alive, do its job, and go back to sleep. Now, how cool is that?!

As if that wasn't enough, downloading and uploading content between S3 and lambda (in the same region) is free of charge; meaning that you can edit all the files you want, no matter how large (imagine a huge archive where you need to change only a teeny weeny 1 KB configuration file) without spending an extra cent!

Okay, let's roll!

Sigma: Think Serverless!

We'll be using Sigma, the brand new serverless IDE, in our dev path. There are some good reasons:

  • Firstly, Sigma is completely browser-based; nothing to install, except for a web browser - which you already have!
  • Secondly, Sigma completely takes care of your AWS stuff, including the management of AWS entities (lambdas, S3 buckets, IAM roles and whatnot) and their interconnections - once you give it the necessary AWS account (IAM) permissions, you don't have to open up a single AWS console ever again!
  • Last but not least, Sigma automagically generates and maintains all bits and pieces of your project - including lambda trigger configurations, execution roles, permissions and related resources - under a single CloudFormation stack (kind of like a deployment "definition"; somewhat like AWS SAM in case you're familiar, but much more flexible, as Sigma allows integration with already existing AWS entities as well).

(In fact, behind the scenes, Sigma also leverages this "edit a zip file in S3" approach to drive its new QuickBuild feature; which will go public pretty soon!)

If the above sounded totally Greek to you, chill out! Lambda - or AWS, for that matter - may not look totally newbie-friendly, but I assure you that Sigma will make it easy - and way much fun - to get started!


TL;DR: if you're in a hurry, you can simply open the ready-made sample from my GitHub repo into Sigma and deploy it right away; just remember to edit the two S3 operations (by clicking the two tiny S3 icons in front of the s3.getObject() and s3.putObject() calls) to point to your own bucket, instead of mine - otherwise, your deployment will fail!


Okay, time to sign up for Sigma; here's the official guide, and we'll be putting out a video pretty soon as well!

When you are in, create a new project (with a nice name - how about zipchamp?).

your brand new Sigma project

Sigma will show you the editor right away, with boilerplate code for a default lambda function named, well, lambda.js. (Sigma currently supports NodeJS, and we will surely be introducing more languages in the near future!)

Sigma showing our brand-new project 'zipchamp'

For this attempt, we'll only focus on updating small textual files inside an S3 archive (don't worry; the archive itself could still be pretty big!), rather than binaries or large files. We will invoke the lambda with a payload containing a map of file changes: keys are the paths of files inside the archive, and values are their content (to be added/modified, or to be removed if set to null).

{
    "path": "path/to/zip/file/within/bucket",
    "changes": {
        "path/to/new/file/1": "content for file 1",
        "path/to/existing/file/2": "new content of file 2",
        "path/to/to-be-deleted/file/3": null
    }
}

Sending such a map-type payload to a lambda would be quite easy if we use JSON over HTTP; luckily lambda provides direct integration with API Gateway, just the thing we would have been looking for.
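
(Just to set the scene: once the /edit endpoint is wired up and deployed - which we'll do below - calling zipchamp could look roughly like the following sketch; the endpoint host and the object key here are placeholders, not real values.)

// hedged sketch: POSTing a change-set to the deployed endpoint (placeholder URL and keys)
const https = require("https");

const payload = JSON.stringify({
    path: "archives/my-site.zip",                    // hypothetical zip file key
    changes: {"config/settings.json": "{\"env\": \"prod\"}"}
});

const req = https.request({
    hostname: "abcdef1234.execute-api.us-east-1.amazonaws.com",   // placeholder API ID/region
    path: "/Prod/edit",
    method: "POST",
    headers: {"Content-Type": "application/json", "Content-Length": Buffer.byteLength(payload)}
}, res => {
    let body = "";
    res.on("data", chunk => body += chunk);
    res.on("end", () => console.log("zipchamp says:", body));
});
req.end(payload);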

Firstly, let's add the jszip dependency to our project, via the Add Dependency toolbar button, bestowing upon ourselves the power to modify zip files:

'Add Dependency' button

searching for 'jszip' dependency

Time to add a trigger to our function, so that it can be invoked externally (in our case, via an HTTP request coming through API Gateway). Click the API Gateway entry on the Resources pane on the left, drag it into the editor, and drop it right onto the function header (containing the event parameter highlighted in red, with a red lightning symbol in front):

dragging API Gateway trigger

Sigma will open a pop-up, asking you for the configurations of the API Gateway trigger that you are about to define. Let's define a new API having API Name zipchamp, with a Resource Path /edit which accepts requests of type (Method) POST and routes them into our lambda. We also need to specify a Deployment Stage name, which could practically be anything (just an identifier for the currently active set of "versions" of the API components); we'll stick to Prod.

API Gateway pop-up configuration

Since we want to modify different files at different times, we also need to include the path of the zip file in the payload. It is easiest to pin down our payload format at this point, to prevent possible omissions or confusion later on.

    /* The request payload will take the following format:
    {
        "path": "path/to/zip/file/within/bucket",
        "changes": {
            "path/to/new/file/1": "content for file 1",
            "path/to/existing/file/2": "new content of file 2",
            "path/to/to-be-deleted/file/3": null
        }
    }
     */

Now, assuming a payload of the above format, we can start coding our magical zip-edit logic.

Planning out our mission:

  • fetch the file content from S3
  • open the content via JSZip
  • iterate over the entries (filenames with paths) in the changes field of our payload:
    • if the entry value is null, remove it from the archive
    • otherwise, add the entry value to the archive as a file (with name set to the entry key); which would be an update or an insertion depending on whether the file already existed in the archive (we could identify this difference as well, by slightly modifying the algorithm)
  • once the modifications are done, we could have uploaded the modified archive directly to S3, but the upload requires us to know the size of the content in advance; unfortunately JSZip does not provide this yet. So we'd have to
    • save the modified zip file to the filesystem (/tmp), and
    • upload the file to S3, via a stream opened for the saved file, and specifying the Content-Length as the size of the saved file

Let's start by require-ing the JSZip dependency that we just added (along with fs, which we'll need real soon):

let AWS = require('aws-sdk');
let JSZip = require("jszip");
let fs = require("fs");

const s3 = new AWS.S3();

exports.handler = function (event, context, callback) {
    /* The request payload will take the following format:

And defining some variables to hold our processing state:

    }
     */

    let changes = event.changes;
    let modified = 0, removed = 0;

First, we need to retrieve the original file from S3, which requires an S3 operation. Drag an S3 entry into the editor, and configure it for a Get Object Operation. For the Bucket, you can either define a new bucket via the New Bucket tab (which would be created and managed by Sigma at deployment time, on your behalf - no need to go and create one in the S3 console!) or pick an already existing one from the Existing Bucket tab (handy if you already have a bucketful of archives to edit - or a default "artifacts" bucket where you host your archive artifacts).

configuring the 's3.getObject()' operation

Here I have:

  • used an existing Bucket, hosted-archives: the "stage" where I wish to perform all my zip-edit magic,
  • selected Get Object as the Operation, and
  • picked the path param from the original payload (event.path) as the S3 Object Key, @{event.path} (basically the path of the file inside the bucket, which we are interested in fetching); notice the @{} syntax in the pop-up field, which instructs Sigma to use the enclosed content as JS code rather than a constant string parameter (similar to ${} in JS).

If all is well, once you click the Inject button (well, it only gets enabled when all is well, so we're good there!), Sigma will inject an s3.getObject() code snippet - rich with some boilerplate stuff - right where you drag-dropped the S3 entity. As if that isn't enough, Sigma will also highlight the operation parameter block (first parameter of s3.getObject()), indicating that it has understood and is tracking that code snippet as part of its behind-the-scenes deployment magic! Pretty cool, right?

    s3.getObject({
        'Bucket': "hosted-archives",
        'Key': event.path
    }).promise()
        .then(data => {
            console.log(data);           // successful response
            /*
            data = {
                AcceptRanges: "bytes", 
                ContentLength: 3191, 
                ContentType: "image/jpeg", 
                ETag: "\\"6805f2cfc46c0f04559748bb039d69ae\\"", 
                LastModified: , 
                Metadata: {...}, 
                TagCount: 2, 
                VersionId: "null"
            }
            */
        })
        .catch(err => {
            console.log(err, err.stack); // an error occurred
        });

Sigma detects and highlights the 's3.getObject()' operation!

In the s3.getObject() callback, we can load the resulting data buffer as a zip file, via JSZip#loadAsync(buffer):

        .then(data => {
            let jszip = new JSZip();
            jszip.loadAsync(data.Body).then(zip => {

            });

Once the zip file is loaded, we can start iterating through our changes list:

  • If the value of the change entry is null, we remove the file (JSZip#remove(name)) from the archive;
  • Otherwise we push the content to the archive (JSZip#file(name, content)), which would correspond to either an insertion or modification depending on whether the corresponding entry (path) already exists in the archive.
            jszip.loadAsync(data.Body).then(zip => {
                Object.keys(changes).forEach(name => {
                    if (changes[name] !== null) {
                        zip.file(name, changes[name]);
                        modified++;
                    } else {
                        zip.remove(name);
                        removed++;
                    }
                });

We also track the modifications via two counters - one for additions/modifications and another for deletions.

Once the changes processing is complete, we are good to upload the magically transformed file back to S3.

  • As mentioned before, we need to first save the file to disk so that we can compute the updated size of the archive:
                    let tmpPath = `/tmp/${event.path.split("/").pop()}`;  // keep only the file name; /tmp won't have the key's directory structure
                    zip.generateNodeStream({ streamFiles: true })
                        .pipe(fs.createWriteStream(tmpPath))
                        .on('error', err => callback(err))
                        .on('finish', function () {
    
                        });
  • Time for the upload! Just like before, drag-drop a S3 entry into the 'finish' event of zip.generateNodeStream(), and configure it as follows:
    • Bucket: the same one that you picked earlier for s3.getObject(); I'll pick my hosted-archives bucket, and you can pick your own - just remember that, if you defined a new bucket earlier, you will have to pick it from the Existing Bucket list this time (because, from Sigma's point of view, the bucket is already defined; the entry will be prefixed with (New) to make your life easier).
    • Operation: Put Object
    • Object Key: @{event.path} (once again, note the @{} syntax; the Sigma way of writing ${})
    • Content of Object: @{fs.createReadStream(tmpPath)} (we reopen the saved file as a stream so the S3 client can read it)
    • Metadata: click the Add button (+ sign) and add a new entry pair: Content-Length = @{String(fs.statSync(tmpPath).size)} (reporting the on-disk size of the archive as the length of content to be uploaded)

configuring the 's3.putObject()' operation

                    .on('finish', function () {
                        s3.putObject({
                            "Body": fs.createReadStream(tmpPath),
                            "Bucket": "hosted-archives",
                            "Key": event.path,
                            "Metadata": {
                                "Content-Length": String(fs.statSync(tmpPath).size)
                            }
                        })
                            .promise()
                            .then(data => {

                            })
                            .catch(err => {

                            });
                    });
  • Lastly, let's invoke the callback once our S3 uploader has completed its job, indicating that our mission completed successfully:
                            .then(data => {
                                callback(null, {
                                    modified: modified,
                                    removed: removed
                                });
                            })
                            .catch(err => {
                                callback(err);
                            });

Notice the second parameter of the first callback, where we send a small "summary" of the changes done (file modifications and deletions). When we invoke the lambda via an HTTP request, we will receive this "summary" as a JSON response payload.
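
For instance, a request that updates two files and deletes one would come back with something like (values obviously hypothetical):

{
    "modified": 2,
    "removed": 1
}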

Throw in some log lines and error handling logic, until our lambda starts to look pretty neat:

let AWS = require('aws-sdk');
let JSZip = require("jszip");
let fs = require("fs");

const s3 = new AWS.S3();

exports.handler = function (event, context, callback) {
    /* The request payload will take the following format:
    {
        "path": "path/to/zip/file/within/bucket",
        "changes": {
            "path/to/new/file/1": "content for file 1",
            "path/to/existing/file/2": "new content of file 2",
            "path/to/deleted/file/3": null
        }
    }
     */

    let changes = event.changes;
    let modified = 0, removed = 0;

    console.log(`Fetching ${event.path}`);
    s3.getObject({
        'Bucket': "hosted-archives",
        'Key': event.path
    }).promise()
        .then(data => {
            let jszip = new JSZip();
            console.log(`Opening ${event.path}`);
            jszip.loadAsync(data.Body).then(zip => {
                console.log(`Opened ${event.path} as zip`);
                Object.keys(changes).forEach(name => {
                    if (changes[name] !== null) {
                        console.log(`Modify ${name}`);
                        zip.file(name, changes[name]);
                        modified++;
                    } else {
                        console.log(`Remove ${name}`);
                        zip.remove(name);
                        removed++;
                    }
                });

                let tmpPath = `/tmp/${event.path.split("/").pop()}`;  // keep only the file name; /tmp won't have the key's directory structure
                console.log(`Writing to temp file ${tmpPath}`);
                zip.generateNodeStream({ streamFiles: true })
                    .pipe(fs.createWriteStream(tmpPath))
                    .on('error', err => callback(err))
                    .on('finish', function () {
                        console.log(`Uploading to ${event.path}`);
                        s3.putObject({
                            "Body": fs.createReadStream(tmpPath),
                            "Bucket": "hosted-archives",
                            "Key": event.path,
                            "Metadata": {
                                "Content-Length": String(fs.statSync(tmpPath).size)
                            }
                        })
                            .promise()
                            .then(data => {
                                console.log(`Successfully uploaded ${event.path}`);
                                callback(null, {
                                    modified: modified,
                                    removed: removed
                                });
                            })
                            .catch(err => {
                                callback(err);
                            });
                    });
            })
                .catch(err => {
                    callback(err);
                });
        })
        .catch(err => {
            callback(err);
        });
}
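
By the way, if you'd like to smoke-test the handler locally before deploying, a tiny driver script like the one below would do. It assumes the handler lives in a file named lambda.js and that your local AWS credentials can read and write the hosted-archives bucket - both of which are just assumptions for this sketch:

// local-test.js - hypothetical local driver for the handler above
const handler = require("./lambda").handler;

handler(
    {
        path: "my-dir/my-awesome-file.zip",
        changes: { "conf/config.file": "property.one=value1\nproperty.two=value2" }
    },
    {},     // context is not used by our handler
    (err, result) => err ? console.error(err) : console.log(result)
);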

The hard part is over!

(Maybe not that hard, given all the cool drag-and-drop stuff; we sincerely hope you enjoyed it!)

Now, simply click the Deploy Project button on the toolbar (or the menu item, if you like it that way) to set the wheels in motion.

Sigma will guide you through a few steps for getting your lambda up and running: committing your code to GitHub, building it, and displaying a summary of how it plans to deploy zipchamp to your AWS account via CloudFormation.

deployment summary

Once you hit Execute, Sigma will run the deployment gimmicks for you, and upon completion display the URL of the API endpoint - the trigger that we defined earlier, now live and waiting eagerly for your zip-edit requests!

deployment completed

Testing the whole thing is pretty easy: just fire up your favourite HTTP client (Postman, perhaps?) and send a POST request to the deployment URL from above, with a payload conforming to our smartly-crafted zip-edit request format:

POST /Prod/edit HTTP/1.1
Host: yourapiurl.execute-api.us-east-1.amazonaws.com
Content-Length: 139

{
    "path": "my-dir/my-awesome-file.zip",
    "changes": {
        "conf/config.file": "property.one=value1\nproperty.two=value2"
    }
}

If you are a curl guy like me:

curl -XPOST --data '{
    "path": "my-dir/my-awesome-file.zip",
    "changes": {
        "conf/config.file": "property.one=value1\nproperty.two=value2"
    }
}' https://yourapiurl.execute-api.us-east-1.amazonaws.com/Prod/edit

If you would rather have zipchamp play with a test zip file than let it mess around with your premium content, you can use a simple zip structure like the following:

file.zip:
├── a/
│   ├── b = "bb"
│   └── d/
│       ├── e = "E"
│       └── f = "F"
└── c = ""

with a corresponding modification request:

{
    "path": "file.zip",
    "changes": {
        "b": "ba\nba\r\nblack\n\n\nsheep",
        "a/d/e": null,
        "a/c": "",
        "a/b": "",
        "a/d/f": "G"
    }
}

which, upon execution, would result in a modified zip file with the following structure:

file.zip:
├── a/
│   ├── b = ""
│   ├── c = ""
│   └── d/
│       └── f = "G"
├── b = "ba\nba\r\nblack\n\n\nsheep"
└── c = ""
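
If you don't have such a test file lying around, a few lines of JSZip (run locally; the bucket name below is just a placeholder for your own) can build and upload the sample structure for you:

// make-test-zip.js - hypothetical helper to create and upload the sample archive
const AWS = require("aws-sdk");
const JSZip = require("jszip");

const zip = new JSZip();
zip.file("a/b", "bb");
zip.file("a/d/e", "E");
zip.file("a/d/f", "F");
zip.file("c", "");

zip.generateAsync({ type: "nodebuffer" })
    .then(buffer => new AWS.S3().putObject({
        Bucket: "hosted-archives",    // replace with your bucket
        Key: "file.zip",
        Body: buffer
    }).promise())
    .then(() => console.log("uploaded file.zip"))
    .catch(err => console.error(err));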

Obviously you don't need to pull your hair out composing the JSON request payload by hand; a simple piece of code, like the following Python snippet, will do just that on your behalf, once you feed it the archive path (key) in S3, and the in-archive paths and content (local filesystem paths) of the files you need to update:

import json

payload = {
    "changes": {
        "file path 1 inside archive": open("file path 1 in local filesystem").read(),
        "file path 2 inside archive": open("file path 2 in local filesystem").read(),
        "file path inside archive, to be deleted": None
    },
    "path": "path/to/zip/file/within/bucket"
}
print(json.dumps(payload))
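
And if you would rather stay in JavaScript land, a rough Node.js equivalent could build the payload and fire the request in one go; the file paths and endpoint URL below are placeholders you'd swap for your own:

// send-edit.js - hypothetical Node.js counterpart of the above snippet
const fs = require("fs");
const https = require("https");

const payload = JSON.stringify({
    path: "path/to/zip/file/within/bucket",
    changes: {
        "conf/config.file": fs.readFileSync("./config.file", "utf8"),
        "docs/obsolete.txt": null     // null marks a deletion
    }
});

const req = https.request({
    hostname: "yourapiurl.execute-api.us-east-1.amazonaws.com",
    path: "/Prod/edit",
    method: "POST",
    headers: {
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(payload)
    }
}, res => {
    let body = "";
    res.on("data", chunk => body += chunk);
    res.on("end", () => console.log(res.statusCode, body));
});
req.on("error", err => console.error(err));
req.end(payload);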

Nice! Now you can start using zipchamp to transform your remotely-hosted zip files, without having to download them anywhere, ever again!

Ah, and don't forget to spread the good word, trying out more cool stuff with Sigma and sharing it with your fellow devs!

Look, look! EI on LXC!

Containers have landed. And they are conquering the DevOps space. Fast.

containers: the future (https://blog.alexellis.io/content/images/2017/02/coding_stacks.jpg)

Enterprise Integration has been leading and driving the business success of organizations for ages.

enterprise integration: also the future

Now, thanks to AdroitLogic IPS, you can harness the best of both worlds - all the goodies of enterprise integration (EI), powered by the flexibility of Linux containers (LXC) - for your rapidly scaling, EI-hungry business and enterprise!

IPS for on-premise enterprise integration!

If you have already looked at IPS, you might have noticed that it offers only a VirtualBox-based, single-instance evaluation distribution - not the most realistic way to try out a supposedly highly-available, highly-scalable integration platform.

But this time we have something "real" - something you can try out on your own K8s (or K8s-compatible, say OpenShift) cluster. It offers much better scalability - you can deploy our high-performance UltraESB-X instances in unlimited numbers, across up to 20 physical nodes (just ping us if you need more!) - and flexibility: the underlying K8s infrastructure is totally under your control, for upgrades, patches, HA, DR and whatnot.

Kubernetes: the helmsman of container orchestration

OpenShift Container Platform (https://blog.openshift.com/wp-content/uploads/openshift_container_platform.png)

But the best part is that your UltraESB-X enterprise integrator instances would be running right inside your K8s network, meaning that you can easily and seamlessly integrate them with all your existing stuff: services (or microservices), queues, CI/CD pipelines, messaging systems, management and monitoring mechanisms, you name it.

UltraESB-X: enterprise integration - the easy way!

You can run the new IPS installer from any machine that has SSH access to the cluster - even from within the cluster itself. The process is fairly simple: just fill in your cluster details in config.sh and run the installer script.

If all goes well, the mission will go like this:

               ***************************************
                      Welcome to IPS Installer!
               ***************************************

Loading configurations...

++ MASTER=ubuntu@ip-1-2-3-4
++ NODES=(ubuntu@ip-5-6-7-8 ubuntu@ip-9-0-1-2)
++ SSH_ARGS='-i /path/to/aws/key.pem'
++ DB_IN_CLUSTER=true
++ DB_URL='jdbc:mysql://mysql.ips-system.svc.cluster.local:3306/ips?useSSL=false'
++ DB_USER=ipsuser
++ DB_PASS='7h1Zl$4v3RyI95e~CUr#*@m0R+'
++ DB_NODE=
++ ES_ENABLED=false
++ ES_IN_CLUSTER=true
++ ES_HOST=elasticsearch.ips-system.svc.cluster.local
++ ES_PORT=9300
++ ES_NODE=
++ DOCKER_REPO=adroitlogic
++ DOCKER_TAG=17.07.2-SNAPSHOT
++ set +x

Checking configurations...

NOTE: DB_NODE was not specified, defaulting to ip-5-6-7-8.
NOTE: ES_NODE was not specified, defaulting to ip-9-0-1-2.
Configurations checked. Looks good.

IPS will download required Docker images into your cluster
(~550 MB, or ~400 MB if you have disabled statistics).

Agree? yes


Starting IPS installation...

IPS needs to download the MySQL Java client library (mysql-connector-java-5.1.38-bin.jar)
in order to proceed with the installation.
Please type 'y' or 'yes' and press Enter if you agree.
If you are curious to know why we do it this way, check out
https://www.mysql.com/about/legal/licensing/oem/#3.

Agree? 

At this point you can either accept the proposal (obvious choice) or deny it (in which case the installer would fail).

Should you choose to accept it:

Agree? yes


Starting IPS installation...

Preparing ubuntu@ip-5-6-7-8...
Connection to ip-5-6-7-8 closed.
client.key.properties                                                                                                                                                            100%   53     0.1KB/s   00:00
license.conf.properties                                                                                                                                                          100%   78     0.1KB/s   00:00
license.key.properties                                                                                                                                                           100%   53     0.1KB/s   00:00
mysql-connector-java-5.1.38-bin.jar                                                                                                                                              100%  961KB 960.9KB/s   00:00
Successfully prepared ubuntu@ip-5-6-7-8

Preparing ubuntu@ip-9-0-1-2...
Connection to ip-9-0-1-2 closed.
client.key.properties                                                                                                                                                            100%   53     0.1KB/s   00:00
license.conf.properties                                                                                                                                                          100%   78     0.1KB/s   00:00
license.key.properties                                                                                                                                                           100%   53     0.1KB/s   00:00
mysql-connector-java-5.1.38-bin.jar                                                                                                                                              100%  961KB 960.9KB/s   00:00
Successfully prepared ubuntu@ip-9-0-1-2

configserver-rc.yaml                                                                                                                                                             100% 1960     1.9KB/s   00:00
configserver-svc.yaml                                                                                                                                                            100%  415     0.4KB/s   00:00
elasticsearch-rc.yaml                                                                                                                                                            100% 1729     1.7KB/s   00:00
elasticsearch-svc.yaml                                                                                                                                                           100%  418     0.4KB/s   00:00
ips-admin.yaml                                                                                                                                                                   100% 1093     1.1KB/s   00:00
ips-stats.yaml                                                                                                                                                                   100%  684     0.7KB/s   00:00
ipsweb-rc.yaml                                                                                                                                                                   100% 5023     4.9KB/s   00:00
ipsweb-svc.yaml                                                                                                                                                                  100%  399     0.4KB/s   00:00
mysql-rc.yaml                                                                                                                                                                    100% 1481     1.5KB/s   00:00
mysql-svc.yaml                                                                                                                                                                   100%  360     0.4KB/s   00:00
namespace "ips-system" created
namespace "ips" created
clusterrole "ips-node-stats" created
clusterrolebinding "ips-node-stats" created
clusterrole "ips-stats" created
clusterrolebinding "ips-stats" created
clusterrole "ips-admin" created
clusterrolebinding "ips-admin" created
replicationcontroller "mysql" created
service "mysql" created
replicationcontroller "configserver" created
service "configserver" created
replicationcontroller "ipsweb" created
service "ipsweb" created

IPS installation completed!
The IPS dashboard will be available at https://ip-5-6-7-8:30080 shortly.

You can always reach us at
    info@adroitlogic.com
or
    https://www.adroitlogic.com/contact/

Enjoy! :)

That's it! Time to fire up the dashboard and get on with it!

IPS: enterprise deployment on a single dashboard!

A few things to note, before you rush:

  • the hostnames/IP addresses used in config.sh should be the same as the node names used on the K8s side; otherwise the IPS components may fail to recognize each other and the master. For now, an easy trick is to use the K8s node names directly for the MASTER and NODES parameters (oh, and don't forget DB_NODE and ES_NODE!), and add host entries (e.g. in /etc/hosts on the installer machine) pointing those names to the visible IP addresses of the actual host machines - at least until we make things more flexible in the not-too-distant future.
  • Docker images for IPS management components will start getting downloaded on demand, as and when they are defined on the K8s side. Hence it may take some time for the system to stabilize (that is, before you can access the dashboard).
  • Similarly, the UltraESB-X Docker image will be downloaded on a worker node only when the first ESB instance gets scheduled on that node, meaning that you might observe slight delays during the first few ESB cluster deployments. If necessary, you can avoid this by manually running docker pull adroitlogic/ips-worker:17.07.2-SNAPSHOT on each worker node.

With the new distribution, we have also allowed you to customize where you store your IPS configurations (MySQL) and statistics (Elasticsearch):

  • either set DB_IN_CLUSTER (or ES_IN_CLUSTER) to false and point to an external MySQL DB (or ES server) using DB_HOST, DB_PORT, DB_USER and DB_PASS (or ES_HOST and ES_PORT),
  • or set it to true and use DB_NODE (ES_NODE) to name the node where MySQL (ES) should be deployed as an in-cluster pod.

Using an external MySQL or ES instance may be handy when your cluster has limited resources (especially memory) and you want to utilize them for running ESB instances rather than infrastructure components.

customizable installation (https://d30y9cdsu7xlg0.cloudfront.net/png/128607-200.png)

Additionally, now you can also disable some non-essential features of IPS, such as statistics, at installation itself; just set ES_ENABLED to false, and IPS will skip the installation of an ES container and also stop the collection of ESB statistics at runtime. This can be really handy if you are running ESBs in a resource-constrained environment - disabling ES can bring down the per-ESB startup memory footprint from 700 MB right down to 250 MB! (We are already working on a leaner statistics collector based on the lean and sexy ES REST client - along with some other cool improvements - and once it is out, you will be able to run stats-enabled ESBs at under 150 MB memory.)

The new release is so new that we barely had time to write the official docs for it - but all the existing docs, including the user guide and samples, are applicable to it, with some subtle differences:

  • The ESB Docker image (ips-worker) uses a different tag - a steaming hot 17.07.2-SNAPSHOT, instead of the default 17.07.
  • In samples, while you previously had to use the host-only address of the VM-based IPS node for accessing deployed services, now you can do it in a more "natural" way - by using the hostname of any node in the K8s cluster, just as you would do with any other K8s-based deployment.

For those of you who are starting from scratch, we have included a tiny guide to get you started with kubeadm - derived from the official K8s docs - that would kick-start you with a fully-managed K8s cluster within minutes, on your favourite environment (bare-metal or cloud). We also ship a TXT version inside the installer archive, in case you want to read it offline.

get started right away with kubeadm! (http://makeawebsite.org/wp-content/uploads/2014/12/quickstart-guide-icon.jpg)

And last but not least, if you don't like what you see (although we're pretty sure that you will!), you can purge all IPS-related things from your cluster (:sad_face:) with another single command, teardown.sh:

++ MASTER=ubuntu@ip-1-2-3-4
++ NODES=(ubuntu@ip-5-6-7-8 ubuntu@ip-9-0-1-2)
++ SSH_ARGS='-i /path/to/aws/key.pem'
++ DB_IN_CLUSTER=true
++ DB_URL='jdbc:mysql://mysql.ips-system.svc.cluster.local:3306/ips?useSSL=false'
++ DB_USER=ipsuser
++ DB_PASS='7h1Zl$4v3RyI95e~CUr#*@m0R+'
++ DB_NODE=
++ ES_ENABLED=false
++ ES_IN_CLUSTER=true
++ ES_HOST=elasticsearch.ips-system.svc.cluster.local
++ ES_PORT=9300
++ ES_NODE=
++ DOCKER_REPO=adroitlogic
++ DOCKER_TAG=17.07.2-SNAPSHOT
++ set +x
Starting IPS tear-down...

Tearing down master...
namespace "ips-system" deleted
namespace "ips" deleted
clusterrole "ips-node-stats" deleted
clusterrole "ips-admin" deleted
clusterrole "ips-stats" deleted
clusterrolebinding "ips-node-stats" deleted
clusterrolebinding "ips-admin" deleted
clusterrolebinding "ips-stats" deleted
Successfully tore down master

Tearing down ubuntu@ip-5-6-7-8...
Connection to ip-5-6-7-8 closed.
Successfully tore down ubuntu@ip-5-6-7-8

Tearing down ubuntu@ip-9-0-1-2...
Connection to ip-9-0-1-2 closed.
Successfully tore down ubuntu@ip-9-0-1-2

IPS tear-down completed!

Enough talking, time to jump-start your integration - this time, on containers!