Sunday, April 22, 2018

Deploying your stuff with Google Cloud Deployment Manager: via NodeJS

This may not be the correct way; heck, this may be the crappiest way. I'm putting this up because I could not find a single decent sample on how to do it with JS.

The approach in this post uses NodeJS (server-side), but it is possible to do the same on the client side by loading the Google API client and subsequently the deploymentmanager v2 module; I'll write about it as well, if/when I get a chance.

First you set up authentication so your googleapis client can obtain a token automatically.

Assuming that you have added the googleapis:28.0.1 NPM module to your dependencies list, and downloaded your service account key into the current directory (where the deploymentmanager-invoking code is residing):

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

I used a service account, so YMMV.

If you like, you can cache the token at dev time by adding some more gimmicks: I used axios-debug-log to intercept the auth response and persist the token to a local file, from which I read the token during subsequent runs (if the token expires the JWT client will automatically refresh it, which I will then persist):

process.env.log = "axios";
const tokenFile = "./token.json";
require("axios-debug-log")({
    // disable extra logging
    request: function (debug, config) {},
    error: function (debug, error) {},
    response: function (debug, response) {
        // grab and save access token for reuse
        if (response.data.access_token) {
            console.log("Updating token");
            require("fs").writeFile(tokenFile, JSON.stringify(response.data));
        }
    },
});

// load saved token; if success, use OAuth2 client with loaded token instead of JWT client
// (avoid re-auth at each run)
try {
    const token = require(tokenFile);
    if (!token.access_token) {
        throw Error("no token found");
    }
    token.refresh_token = token.refresh_token || "refreshtoken";    //mocking
    console.log("Using saved tokens from", tokenFile);
    jwtClient.setCredentials(token);
} catch (e) {
    console.log(e.message);
}

Fair enough. Now to get the current state of the deployment:

const projectId = "your-gcp-project-id";
const deployment = "your-deployment-name";

const deployments = google.deploymentmanager("v2").deployments;

let fingerprint = null;

let deployment = deployments.get({
    project: projectId,
    deployment: deployment
})
    .then(response => {
        fingerprint = response.data.fingerprint;
        console.log("Fingerprint", fingerprint);
        return Promise.resolve(response);
    })
    .then(response => {
        // continue the logic
    });

The "fingerprint logic" is needed because we need to pass a "fingerprint" to every "write" (update (preview/start), stop, cancelPreview etc.) operation in order to guarantee in-order execution and operation synchronization.

That done, we set up an update for our deployment by creating a deployment preview (shell) within the last .then():

    .then(response => {
        console.log("Creating deployment preview", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: true,
            resource: {
                name: deployment,
                fingerprint: fingerprint,
                target: {
                    config: {
                        content: JSON.stringify({
                            resources: [
                                /* your resource definitions here; e.g.

                                {
                                    name: "myGcsBucket",
                                    type: "storage.v1.bucket",
                                    properties: {
                                        storageClass: "STANDARD",
                                        location: "US",
                                        labels: {
                                            "keyOne": "valueOne"
                                        }
                                    }
                                }
                                
                                and so on */
                            ]
                        }, 4, 2)
                    }
                }
            }
        })
            .catch(e => err("Failed to preview deployment", e))
    })

// small utilty function for one-line throws

const err = (msg, e) => {
    console.log(`${msg}: ${e}`);
    throw e;
};

Notice that we passed fingerprint as part of the payload. Without it, Google would complain that it expected one.

But now, we again need to call deployments.get() because the fingerprint would have been updated! (Why the heck doesn't Google return the fingerprint in the response itself?!)

Maybe it's easier to just wrap the modification calls inside a utility code snippet:

const filter = {
    project: projectId,
    deployment: deployment
};

const ensureFingerprint = promise =>
    promise
        .then(response => deployments.get(filter))
        .then(response => {
            fingerprint = response.data.fingerprint;
            console.log("Fingerprint", fingerprint);
            return Promise.resolve(response);
        });

// ...

let preview = ensureFingerprint(Promise.resolve(null))   // only obtain the fingerprint
    .then(response => {
        console.log("Creating deployment preview", deployment);
        return ensureFingerprint(deployments.update({
            // same payload from previous code block
        }))
            .catch(e => err("Failed to preview deployment", e))
    })

True, it's nasty to have a global fingerprint variable. You can pick your own way.

Meanwhile, if the initial deployments.get() fails due to a deployment being not found by the given name, we can create one (along with a preview) right away:

    .catch(e => {
        // fail unless the error is a 'not found' error
        if (e.code === 404) {
            console.log("Deployment", deployment, "not found, creating");
            return ensureFingerprint(deployments.insert({
                // identical to deployments.create(), except for missing fingerprint
                project: projectId,
                deployment: deployment,
                preview: true,
                resource: {
                    name: deployment,
                    target: {
                        config: {
                            content: JSON.stringify({
                                resources: [
                                    // your resource definitions here
                                ]
                            }, 4, 2)
                        }
                    }
                }
            }))
                .catch(e => err("Deployment creation failed", e));
        } else {
            err("Unknown failure in previewing deployment", e);
        }
    });

Now let's keep on "monitoring" the preview until it reaches a stable state (DONE, CANCELLED etc.):

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

let timer = {
    handle: null
};
preview.then(response => {
    console.log("Starting preview monitor", deployment);
    startTimer(() => {
        return deployments.get(filter)
            .catch(e => {
                //TODO detect and ignore temporary failures
                err("Unexpected error in monitoring preview", e);
            })
            .then(response => {
                let op = response.data.operation;
                let status = op.status;
                console.log(status, "at", op.progress, "%");

            })
    }, timer, 5000);
});

And check if we reached a terminal (completion) state:

const SUCCESS_STATES = ["SUCCESS", "DONE"];
const FAILURE_STATES = ["FAILURE", "CANCELLED"];
const COMPLETE_STATES = SUCCESS_STATES.concat(FAILURE_STATES);

// ...

            .then(response => {
                // ...

                if (COMPLETE_STATES.includes(status)) {
                    console.log("Preview completed with status", status);
                    if (SUCCESS_STATES.includes(status)) {
                        if (op.error) {
                            console.error("Errors:", op.error);
                        } else {
                            
                        }
                    } else if (FAILURE_STATES.includes(status)) {
                        console.log("Preview failed, skipping deployment");
                    }
                    return false;
                }
                return true;

If we reach a success state, we can commence the actual deployment:

                        // ...
                        } else {
                            deploy();
                        }

// ...

const deploy = () => {
    let deployer = () => {
        console.log("Starting deployment", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: false,
            fingerprint: fingerprint,
            resource: {
                name: deployment
            }
        })
            .catch(e => err("Deployment startup failed", e))
    };

And start monitoring again, until we reach a completion state:

    // ...

    deployer().then(response => {
        console.log("Starting deployment monitor", deployment);
        startTimer(() => {
            return deployments.get(filter)
                .catch(e => {
                    //TODO detect and ignore temporary failures
                    err("Unexpected error in monitoring deployment", e);
                })
                .then(response => {
                    let op = response.data.operation;
                    let status = op.status;
                    console.log(status, "at", op.progress, "%");

                    if (COMPLETE_STATES.includes(status)) {
                        console.log("Deployment completed with status", status);
                        if (op.error) {
                            console.error("Errors:", op.error);
                        }
                        return false;  // stop
                    }
                    return true;  // continue
                })
        }, timer, 5000);
    });
};

Recap:

const SUCCESS_STATES = ["SUCCESS", "DONE"];
const FAILURE_STATES = ["FAILURE", "CANCELLED"];
const COMPLETE_STATES = SUCCESS_STATES.concat(FAILURE_STATES);

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

const projectId = "your-gcp-project-id";
const deployment = "your-deployment-name";

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

// small utilty function for one-line throws

const err = (msg, e) => {
    console.log(`${msg}: ${e}`);
    throw e;
};

let timer = {
    handle: null
};

const deployments = google.deploymentmanager("v2").deployments;

const filter = {
    project: projectId,
    deployment: deployment
};

let fingerprint = null;
const ensureFingerprint = promise =>
    promise
        .then(response => deployments.get(filter))
        .then(response => {
            fingerprint = response.data.fingerprint;
            console.log("Fingerprint", fingerprint);
            return Promise.resolve(response);
        });

let preview = ensureFingerprint(Promise.resolve(null))   // only obtain the fingerprint
    .then(response => {
        console.log("Creating deployment preview", deployment);
        return ensureFingerprint(deployments.update({
            project: projectId,
            deployment: deployment,
            preview: true,
            resource: {
                name: deployment,
                fingerprint: fingerprint,
                target: {
                    config: {
                        content: JSON.stringify({
                            resources: [
                                // your resource definitions here
                            ]
                        }, 4, 2)
                    }
                }
            }
        }))
            .catch(e => err("Failed to preview deployment", e))
    })
    .catch(e => {
        // fail unless the error is a 'not found' error
        if (e.code === 404) {
            console.log("Deployment", deployment, "not found, creating");
            return ensureFingerprint(deployments.insert({
                // identical to deployments.create(), except for missing fingerprint
                project: projectId,
                deployment: deployment,
                preview: true,
                resource: {
                    name: deployment,
                    target: {
                        config: {
                            content: JSON.stringify({
                                resources: [
                                    // your resource definitions here
                                ]
                            }, 4, 2)
                        }
                    }
                }
            }))
                .catch(e => err("Deployment creation failed", e));
        } else {
            err("Unknown failure in previewing deployment", e);
        }
    });

preview.then(response => {
    console.log("Starting preview monitor", deployment);
    startTimer(() => {
        return deployments.get(filter)
            .catch(e => {
                //TODO detect and ignore temporary failures
                err("Unexpected error in monitoring preview", e);
            })
            .then(response => {
                let op = response.data.operation;
                let status = op.status;
                console.log(status, "at", op.progress, "%");

                if (COMPLETE_STATES.includes(status)) {
                    console.log("Preview completed with status", status);
                    if (SUCCESS_STATES.includes(status)) {
                        if (op.error) {
                            console.error("Errors:", op.error);
                        } else {
                            deploy();
                        }
                    } else if (FAILURE_STATES.includes(status)) {
                        console.log("Preview failed, skipping deployment");
                    }
                    return false;  // stop
                }
                return true;  // continue
            })
    }, timer, 5000);
});

const deploy = () => {
    let deployer = () => {
        console.log("Starting deployment", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: false,
            resource: {
                name: deployment,
                fingerprint: fingerprint
            }
        })
            .catch(e => err("Deployment startup failed", e))
    };

    deployer().then(response => {
        console.log("Starting deployment monitor", deployment);
        startTimer(() => {
            return deployments.get(filter)
                .catch(e => {
                    //TODO detect and ignore temporary failures
                    err("Unexpected error in monitoring deployment", e);
                })
                .then(response => {
                    let op = response.data.operation;
                    let status = op.status;
                    console.log(status, "at", op.progress, "%");

                    if (COMPLETE_STATES.includes(status)) {
                        console.log("Deployment completed with status", status);
                        if (op.error) {
                            console.error("Errors:", op.error);
                        }
                        return false;  // stop
                    }
                    return true;  // continue
                })
        }, timer, 5000);
    });
};

That should be enough to get you going.

Good luck!

Serverless is the new Build Server: Google CloudBuild (Container Builder) via NodeJS

Google's CloudBuild (a.k.a. "Container Builder") is an on-demand, container-based build service offered under the Google Cloud Platform (GCP). For you and me, it is a nice alternative to maintaining and paying for our own build server, and a clever addition to anyone's CI stack.

CloudBuild allows you to start from a source (e.g. a Google Cloud Source repo, a GCS bucket - or perhaps even nothing (a blank directory; "scratch"), incrementally apply several Docker container runs upon it, and publish the final result to a desired location: like a Docker repository or a GCS bucket.

With its wide variety of custom builders, CloudBuild can do almost anything - that is, as far as I see so far, anything that can be achieved by a Docker container and a volume mount can be fulfilled in CloudBuild as well. To our great satisfaction, this includes fetching sources from GitHub/BitBucket repos (in addition to the native source location options), running custom commands like zip, and much more!

Above all this (how nice of GCP!), CloudBuild gives you 2 whole hours (120 minutes) of build time per day, for free - in comparison to the comparable CodeBuild service of AWS, which offers just 1 hour and 40 minutes per month!

So, now let's have a look at how we can run a CloudBuild via JS (server-side NodeJS):

First things first: adding googleapis:28.0.1 to our dependency list;

{
  "dependencies": {
    "googleapis": "28.0.1"
  }
}

Don't forget the npm install!

In our logic flow, first we need to get ourselves authenticated; with the google-auth-library module that comes with googleapis, this is quite straightforward because the client can be fed with a JWT auth client right from the beginning, which will handle all the auth stuff behind the scenes:

const projectId = "my-gcp-project-id";

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

Note that, for the above code to work verbatim, you need to place a service account key file in the current directory (usually obtained by creating a new service account via the Google Cloud console, in case you don't already have one).

Now we can simply retrieve the v1 version of the cloudbuild client from google, and start our magic:

const builds = google.cloudbuild("v1").projects.builds;

First we submit a build "spec" to the CloudBuild service. Below is an example for a typical NodeJS module on GitHub:

builds.create({
    projectId: projectId,
    resource: {
        steps: [
            {
                name: "gcr.io/cloud-builders/git",
                args: ["clone", "https://github.com/slappforge/slappforge-sdk", "."]
            },
            {
                name: "gcr.io/cloud-builders/npm",
                args: ["install"]
            },
            {
                name: "kramos/alpine-zip",
                args: [
                    "-q",
                    "-x", "package.json", ".git/", ".git/**", "README.md",
                    "-r",
                    "slappforge-sdk.zip",
                    "."
                ]
            },
            {
                name: "gcr.io/cloud-builders/gsutil",
                args: [
                    "cp",
                    "slappforge-sdk.zip",
                    "gs://sdk-archives/slappforge-sdk/$BUILD_ID/slappforge-sdk.zip"
                ]
            }
        ]
    }
})
    .catch(e => {
        throw Error("Failed to start build: " + e);
    })

Basically we retrieve the source from GitHub, fetch the dependencies via a npm install, bundle the whole thing using a zip command container (took me a while to figure it out, which is why I'm posting this!) and upload the resulting zip to a GCS bucket.

We can tidy this up a bit (and perhaps make the template reusable for subsequent builds, by extracting out the parameters into a substitutions section:

const repoUrl = "https://github.com/slappforge/slappforge-sdk";
const projectName = "slappforge-sdk";
const bucket = "sdk-archives";

builds.create({
    projectId: projectId,
    resource: {
        steps: [
            {
                name: "gcr.io/cloud-builders/git",
                args: ["clone", "$_REPO_URL", "."]
            },
            {
                name: "gcr.io/cloud-builders/npm",
                args: ["install"]
            },
            {
                name: "kramos/alpine-zip",
                args: [
                    "-q",
                    "-x", "package.json", ".git/", ".git/**", "README.md",
                    "-r",
                    "$_PROJECT_NAME.zip",
                    "."
                ]
            },
            {
                name: "gcr.io/cloud-builders/gsutil",
                args: [
                    "cp",
                    "$_PROJECT_NAME.zip",
                    "gs://$_BUCKET_NAME/$_PROJECT_NAME/$BUILD_ID/$_PROJECT_NAME.zip"
                ]
            }
        ],
        substitutions: {
            _REPO_URL: repoUrl,
            _PROJECT_NAME: projectName,
            _BUCKET_NAME: bucket
        }
    }
})
    .catch(e => {
        throw Error("Failed to start build: " + e);
    })

Once the build is started, we can monitor it like so (with a few, somewhat neat wrappers to properly manage the timer logic):

    .then(response => {
        let timer = {
            handle: null
        };

        startTimer(() => {
            return builds.get({
                projectId: projectId,
                id: response.data.metadata.build.id
            })
                .catch(e => {
                    throw e;
                })
                .then(response => {
                    const COMPLETE_STATES = ["SUCCESS", "DONE", "FAILURE", "CANCELLED"];
                    if (COMPLETE_STATES.includes(response.data.status)) {
                        return false;
                    }
                    return true;
                })
        }, timer, 5000);
    });

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

Once the build reaches a steady state, you are done!

If you want fine-grained details, just dig into response.data within the timer callback blocks.

Happy CloudBuilding!

Friday, April 20, 2018

Serverless: a no-brainer!

Few years ago, containers swept through the dev and devops lands like a category-6 hurricane.

Docker. Rkt. others.

Docker Swarm.

K8s.

OpenShift.

Right now we are literally at the epicenter, but when we glimpse at the horizon we see another one coming!

Serverless.

The funny thing is, "serverless" itself is a misnomer.

Of course there are servers. There are always servers. How can programs execute themselves in thin air, without the support of the underlying hardware and utility modules? So, there are servers.

Just not where you would expect them to be.

Traversing the timeline of computing, we see the turbulent track record?? of servers: birth in secret dungeons of vaccum tubes and city-scale power supplies; multi-ton boxes; networks; clusters; cloud datacenters and server farms (agriculture just lost its royalty!); containers.

Over time, we see servers losing their significance. Gradually, but steadily.

And now, suddenly, puff! They are gone.

Invisible, to be precise.

With serverless, you no longer care about the server. It may be a physical machine, a cloud VM, a K8s pod, an ECS container... heck, even an IoT rig.

Nobody cares, as long as the job gets done.

In this sense, we realize that serverless is nothing new; the concept, and even some pracical implementations, have been there since as far back as 2006. You yourself may have benefited from serverless (or conceptually serverless) architectures; while one may argue them to be PaaSes, Google App Engine and Google Apps Script (especially) are good examples from my Google-ridden "fungramming" history.

Just like touchscreens, serverless resemblances had always been there, but never has the marketing hype been this intense - obviously it is growing, and we'll surely see more of it as time flies by.

AWS had an early entry to the arena and currently owns a huge market share, bigger than all the others combined; Azure is behind, but catching up fast; and Google still seems to be more focused on Kubernetes and related containerization stuff although they too are on the track with Cloud Functions and Firebase.

Streaming and event-driven architectures are playing their part in bringing value to serverless. We should also not forget the cloud hype that made people go every-friggin-thing-as-a-service and later left them wondering how they could pay only for what they really use, only while they use it.

All ramblings aside, serverless is growing in popularity. Platforms are evolving to support more event sources, better integration support for other services and richer monitoring and statistics. Frameworks like Serverless are striving to provide a unified and generifier serverless development experience while IDEs like Sigma are doing their part in helping newbies (and sometimes even professionals) get going with serverless with minimum hassle and maximum speed.

Being new and shiny does not necessarily mean that serverless is the silver bullet for all your dev issues; in fact, right now it fits into only a few enterprise use cases (primarily due to the lack of strong guarantees, which are quite commonplace in the bureaucratic enterprise atmosphere). Nevertheless, providers are already working on this, and we can expect some disruptive - if not revolutionary - changes in the not-too-distant future. However it is always best to re-iterate your requirements before officially stepping into the serverless world, because serverless demands quite a shift in your application architecture, devops, as well as the very core of your developer mindset.

And, of course, the best way to pick the cake is to taste it yourself.

Sigma the Serverless IDE: resources, triggers, and heck, operations

With serverless, you stopped caring about the server.

With Sigma, you stopped (or will stop, if not already) about the platform.

Now all you care about is your code - the bliss of every programmer.

Or is it?

I hold her (the code) in my arms.

If you have done time with serverless frameworks, you would already know how they take away your platform-phobia, abstracting out the platform-specific bits of your serverless app.

And if you have already tried out Sigma, you would have noticed how it takes things further, relieving you of the burden of the configuration and deployment aspects as well.

Sigma for a healthier dev life!

Leaving behind just the code.

Just the beautiful, raw code.

So, what's the catch?

Okay. Now for the untold, unspoken, not-so-popular part.

You see, my friend, every good thing comes at a price.

Lucky for you, with Sigma, the price is very affordable.

Just a matter of sticking to a few ground rules while you develop your app. That's all.

Resources, resources.

All of Sigma's voodoo depends on one key thing: resources.

Resources, resources!

The concept is quite simple: every piece of your serverless app - may it be a DynamoDB table, S3 bucket or SNS topic - is a resource from Sigma's point of view.

If you remember the Sigma UI, the Resources pane on the left contains different resource types that you can have in your serverless app. (True, it's pretty short; but we're working on it :))

Resources pane in Sigma UI

Behind the scenes

When you drag a resource from this pane, into your code, Sigma secretly creates a resource (which it would later deploy into your serverless provider) to track the configurations of the actual service entity (say, the S3 bucket that should exist in AWS by the time your function is running) and all its usages within your code. The tracking is fully automated; frankly, you didn't even want to know about that.

Sigma tracks resources in your app, and deploys them into the underlying platform!

"New" or "existing"?

On almost all of Sigma's resource configuration pop-ups, you may have noticed two options: "new" vs "existing". "New" resources are the ones that would be (or have already been) created as a result of your project, whereas "existing" ones are those which have been created outside of your project.

Now that's a tad bit strange because we would usually use "existing" to denote things that "exist", regardless of their origin - even if they came from Mars.

Better brace yourself, because this also gives rise to a weirder notion: once you have deployed your project, the created resources (which now "exist" in your account) are still treated by Sigma as "new" resources!

And, as if that wasn't enough, this makes the resources lists in Sigma behave in totally unexpected ways; after you define a "new" resource, whenever you want to reuse that resource somewhere else, you would have to look for it under the "existing" tab of the resource pop-up; but it will be marked with a " (new)" prefix because, although it is already defined, it remains "new" from Sigma's point of view.

Now, how sick is that?!

Bang head here.

Perhaps we should have called them "Sigma" resources; or perhaps even better, "project" resources; while we scratch our heads, feel free to chip in and help us with a better name!

Rule o' thumb

Until this awkwardness is settled, the easiest way to get through this mess is to stick to this rule of thumb:


If you added a resource to your current Sigma project, Sigma would treat it as a "new" resource till the end of eternity.


Bottom line: no worries!

Being able to use existing resources is sometimes cool, but it means that your project would be much less portable. Sigma will always assume that the resources referenced by your project are already in existence, regardless of whatever AWS account you attempt to deploy it. At least until (if) we (ever) come up with a different resource management mechanism.


If you want portability, always stick to new resources. That way, even if a complete stranger gets hold of your project and deploys it in his own, alien, unheard-of AWS account, the project would still work.

If you are integrating with an already existing set of resources (e.g. the set of S3 bucket in your already-running dev/test/prod environment), using existing resources is the obvious (and the most convenient) choice.


Anyways, back to our discussion:

Where were we?

Ah, yes. Resources.

The secret life of resources

In a serverless app, you basically use resources for two things:

  • for triggering the app (as an event source, a.k.a. trigger)
  • for performing work inside the app, such as invoking external services

triggers and operations

Resources? Triggers?? Operations???

Sigma also associates its resources with your serverless app in a similar fashion:

In Sigma, a function can have several triggers (as long as the application itself is aware of tackling different trigger event types!), and can contain several operations (obviously).

Yet, they're different.

It is noteworthy that a resource itself is not a trigger or an operation; triggers and operations are associated with resources (they kind of "bridge" functions and resources) but a resource has its own independent life. As a result, a resource can power many triggers (to be precise, zero or more) and get involved in many operations, across many (again, zero or more) functions.

A good example is S3. If you want to write an image resizer function that would pick and process images dropped into a S3 bucket, you would configure a S3 trigger to invoke the function upon the file drop, and a S3 GetObject operation to retrieve and process the file; however, both will point to the same S3 resource, namely the bucket where images are being dropped into and fetched from.

Launch time!

At deployment, Sigma will take care of putting the pieces together - trigger configs, runtime permissions and whatnot - based on which function is associated with which resources, and in which ways (trigger-mode vs operation-mode). You can simply drag, drop and configure your buckets, queues and stuff, write your code, and totally forget about the rest!

That's the beauty of Sigma.

When a resource is "abandoned" (meaning that it is not used in any trigger or operation), it shows up in the "unused resources" list (remember the dustbin button on the toolbar?) and can be removed from the project; remember that if you do this, provided that the resource is a "new" one (rule of thumb: one created in Sigma), it will be automatically removed from your serverless provider account (for example, AWS) during your next deployment!

So there!

if Sigma's resource model (the whole purpose of this article) looks like a total mess-up to you, feel free to raise your voice on StackOverflow - or better still, our GitHub space, FB page or Twitter feed; we would appreciate it very much!

Of course, Sigma has nothing to hide; if you check your AWS account after a few Sigma deployments, you would realize the things we have been doing under the hood.

All of it, to make your serverless journey as smooth as possible.

And easy.

And fun. :)

Welcome to the world of Sigma!

Thursday, April 19, 2018

Sigma QuickBuild: Towards a Faster Serverless IDE

TL;DR

The QuickBuild/QuickDeploy feature described here is pretty much obsoleted by the test framework (ingeniously hacked together by @CWidanage), that gives you a much more streamlined dev-test experience with much better response time!


In case you hadn't noticed, we have recently been chanting about a new Serverless IDE, the mighty SLAppForge Sigma.

With Sigma, developing a serverless app becomes as easy as drag-drop, code, and one-click-Deploy; no getting lost among overcomplicated dashboards, no eternal struggles with service entities and their permissions, no sailing through oceans of docs and tutorials - above all that, nothing to install (just a web browser - which you already have!).

So, how does Sigma do it all?

In case you already tried Sigma and dug a bit deeper than just deploying an app, you may have noticed that it uses AWS CodeBuild under the hood for the build phase. While CodeBuild gives us a fairly simple and convenient way of configuring and running builds, it has its own set of perks:

  • CodeBuild takes a significant time to complete (sometimes close to a minute). This may not be a problem if you just deploy a few sample apps, but it can severely impair your productivity - especially when you begin developing your own solution, and need to reflect your code updates every time you make a change.
  • The AWS Free Tier only includes 100 minutes of CodeBuild time per month. While this sounds like a generous amount, it can expire much faster than you think - especially when developing your own app, in your usual trial-and-error cycles ;) True, CodeBuild doesn't cost much either ($0.005 per minute of build.general1.small), but why not go free while you can? :)

Options, people?

Lambda, on the other hand, has a rather impressive free quota of 1 million executions and 3.2 million seconds of execution time per month. Moreover, traffic between S3 and Lambda is free as far as we are concerned!

Oh, and S3 has a free quota of 20000 reads and 2000 writes per month - which, with some optimizations on the reads, is quite sufficient for what we are about to do.

2 + 2 = ...

So, guess what we are about to do?

Yup, we're going to update our Lambda source artifacts in S3, via Lambda itself, instead of CodeBuild!

Of course, replicating the full CodeBuild functionality via a lambda would need a fair deal of effort, but we can get away with a much simpler subset; read on!

The Big Picture

First, let's see what Sigma does when it builds a project:

  • prepare the infra for the build, such as a role and an S3 bucket, skipping any that already exist
  • create a CodeBuild project (or, if one already exists, update it to match the latest Sigma project spec)
  • invoke the project, which will:
    • download the Sigma project source from your GitHub repo,
    • run an npm install to populate its dependencies,
    • package everything into a zip file, and
    • upload the zip artifact to the S3 bucket created above
  • monitor the project progress, and retrieve the URL of the uploaded S3 file when done.

And usually every build has to be followed by a deployment; to update the lambdas of the project to point to the newly generated source archive; and that means a whole load of additional steps!

  • create a CloudFormation stack (if one does not exist)
  • create a changeset that contains the latest updates to be published
  • execute the changeset, which will, at the least, have to:
    • update each of the lambdas in the project to point to the new source zip file generated by the build, and
    • in some cases, update the triggers associated with the modified lambdas as well
  • monitor the stack progress until it gets through with the update.

All in all, well over 60-90 seconds of your precious time - all to accommodate perhaps just one line (or how about one word, or one letter?) of change!

Can we do better?

At first glance, we see quite a few redundancies and possible improvements:

  • Cloning the whole project source from scratch is overkill, especially when only a few lines/files have changed.
  • Every build will download and populate the NPM dependencies from scratch, consuming bandwidth, CPU cycles and build time.
  • The whole zip file is now being prepared from scratch after each build.
  • Since we're still in dev, running a costly CF update for every single code change doesn't make much sense.

But since CodeBuild invocations are stateless and CloudFormation's resource update logic is mostly out of our hands, we don't have the freedom to meddle with many of the above; other than simple improvements like enabling dependency caching.

Trimming down the fat

However, if we have a lambda, we have full control over how we can simplify the build!

If we think about 80% - or maybe even 90% - of the cases for running a build, we see that they merely involve changes to application logic (code); you don't add new dependencies, move your files around or change your repo URL all the time, but you sure as heck would go through an awful lot of code edits until your code starts behaving as you expect it to!

And what does this mean for our build?

80% - or even 90% - of the time, we can get away by updating just the modified files in the lambda source zip, and updating the lambda functions themselves to point to the updated file!

Behold, here comes QuickDeploy!

And that's exactly what we do, with the QuickBuild/QuickDeploy feature!

Lambda to the rescue!

QuickBuild uses a lambda (deployed in your own account, to eliminate the need for cross-account resource access) to:

  • fetch the latest CodeBuild zip artifact from S3,
  • patch the zip file to accommodate the latest code-level changes, and
  • upload the updated file back to S3, overriding the original zip artifact

Once this is done, we can run a QuickDeploy which simply sends an UpdateFunctionCode Lambda API call to each of the affected lambda functions in your project, so that they can scoop up the latest and greatest of your serverless code!

And the whole thing does not take more than 15 seconds (give or take the network delays): a raw 4x improvement in your serverless dev workflow!

A sneak peek

First of all, we need a lambda that can modify an S3-hosted zip file based on a given set of input files. While it's easy to make with NodeJS, it's even easier with Python, and requires zero external dependencies as well:

Here we go... Pythonic!

import boto3

from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED

s3_client = boto3.client('s3')

def handler(event, context):
  src = event["src"]
  if src.find("s3://") > -1:
    src = src[5:]
  
  bucket, key = src.split("/", 1)
  src_name = "/tmp/" + key[(key.rfind("/") + 1):]
  dst_name = src_name + "_modified"
  
  s3_client.download_file(bucket, key, src_name)
  zin = ZipFile(src_name, 'r')
  
  diff = event["changes"]
  zout = ZipFile(dst_name, 'w', ZIP_DEFLATED)
  
  added = 0
  modified = 0
  
  # files that already exist in the archive
  for info in zin.infolist():
    name = info.filename
    if (name in diff):
      modified += 1
      zout.writestr(info, diff.pop(name))
    else:
      zout.writestr(info, zin.read(info))
  
  # files in the diff, that are not on the archive
  # (i.e. newly added files)
  for name in diff:
    info = ZipInfo(name)
    info.external_attr = 0755 << 16L
    added += 1
    zout.writestr(info, diff[name])
  
  zout.close()
  zin.close()
  
  s3_client.upload_file(dst_name, bucket, key)
  return {
    'added': added,
    'modified': modified
  }

We can directly invoke the lambda using the Invoke API, hence we don't need to define a trigger for the function; just a role with S3 full access permissions would do. (We use full access here because we would be reading from/writing to different buckets at different times.)

CloudFormation, you beauty.

From what I see, the coolest thing about this contraption is that you can stuff it all into a single CloudFormation template (remember the lambda command shell?) that can be deployed (and undeployed) in one go:

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  zipedit:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: zipedit
      Handler: index.handler
      Runtime: python2.7
      Code:
        ZipFile: >
          import boto3
          
          from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED
          
          s3_client = boto3.client('s3')
          
          def handler(event, context):
            src = event["src"]
            if src.find("s3://") > -1:
              src = src[5:]
            
            bucket, key = src.split("/", 1)
            src_name = "/tmp/" + key[(key.rfind("/") + 1):]
            dst_name = src_name + "_modified"
            
            s3_client.download_file(bucket, key, src_name)
            zin = ZipFile(src_name, 'r')
            
            diff = event["changes"]
            zout = ZipFile(dst_name, 'w', ZIP_DEFLATED)
            
            added = 0
            modified = 0
            
            # files that already exist in the archive
            for info in zin.infolist():
              name = info.filename
              if (name in diff):
                modified += 1
                zout.writestr(info, diff.pop(name))
              else:
                zout.writestr(info, zin.read(info))
            
            # files in the diff, that are not on the archive
            # (i.e. newly added files)
            for name in diff:
              info = ZipInfo(name)
              info.external_attr = 0755 << 16L
              added += 1
              zout.writestr(info, diff[name])
            
            zout.close()
            zin.close()
            
            s3_client.upload_file(dst_name, bucket, key)
            return {
                'added': added,
                'modified': modified
            }
      Timeout: 60
      MemorySize: 256
      Role:
        Fn::GetAtt:
        - role
        - Arn
  role:
    Type: AWS::IAM::Role
    Properties:
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      - arn:aws:iam::aws:policy/AmazonS3FullAccess
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
        - Action: sts:AssumeRole
          Effect: Allow
          Principal:
            Service: lambda.amazonaws.com

Moment of truth

Once the stack is ready, we can start submitting our QuickBuild requests to the lambda!

// assuming auth stuff is already done
let lambda = new AWS.Lambda({region: "us-east-1"});

// ...

lambda.invoke({
  FunctionName: "zipedit",
  Payload: JSON.stringify({
    src: "s3://bucket/path/to/archive.zip",
    changes: {
      "path/to/file1/inside/archive": "new content of file1",
      "path/to/file2/inside/archive": "new content of file2",
      // ...
    }
  })
}, (err, data) => {
  let result = JSON.parse(data.Payload);
  let totalChanges = result.added + result.modified;
  if (totalChanges === expected_no_of_files_from_changes_list) {
    // all izz well!
  } else {
    // too bad, we missed a spot :(
  }
});

Once QuickBuild has completed updating the artifact, it's simply a matter of calling UpdateFunctionCode on the affected lambdas, with the S3 URL of the artifact:

lambda.updateFunctionCode({
  FunctionName: "original_function_name",
  S3Bucket: "bucket",
  S3Key: "path/to/archive.zip"
})
.promise()
.then(() => { /* done! */ })
.catch(err => { /* something went wrong :( */ });

(In our case the S3 URL remains unchanged (because our lambda simply overwrites the original file), but it still works because the Lambda service makes a copy of the code artifact when updating the target lambda.)

To speed up the QuickDeploy for multiple lambdas, we can even parallelize the UpdateFunctionCode calls:

Promise.all(
  lambdaNames.map(name =>
    lambda.updateFunctionCode({ /* params */ })
    .promise()
    .then(() => { /* done! */ }))

.then(() => { /* all good! */ })
.catch(err => { /* failures; handle them! */ });

And that's how we gained an initial 4x improvement in our lambda deployment cycle, sometimes even faster than the native AWS Lambda console!