Friday, May 25, 2018

Hola, AS2! (a.k.a. Your First AS2 Message, with AS2Gateway)

AS2, despite being the most popular secure B2B communication protocol, is fairly complicated for a beginner, regardless of how experienced they may be in the actual business domain. Not to mention the various operational caveats of maintaining an in-house AS2 server.

That is exactly why we built AS2Gateway, making it super-easy for even a five-year-old to send AS2 messages to any partner on the other side of the world... or so we boast.

AS2Gateway: AS2 for all!

Hmm... wait... is it actually that easy?

(Are you kidding me? Partners, stations, certificates, encryption certificates, sign certificates, cert chains, async MDN, signed MDN,... sounds like ancient Greek to me! Where am I even supposed to start?)

Relax. Let's go one step at a time.

(Okay.)

Vocabulary first.

AS2 is all about sending and receiving messages (documents; POs, invoices, shipments, you name it) securely, between "partners" (yup, business partners). The format or content of the messages (EDI, X12, ebXML, ... heck, even plaintext, images or videos) doesn't matter; as long as it's a file, AS2 can deliver it, and produce a receipt (MDN) as well (so the sender can be sure that the recipient got it, securely and untampered).

Secure communication

(So there we have partners. But how about me? Am I a partner too?)

Yes you are, from the viewpoint of other partners. But we love you and want you to stand out from the crowd, so in AS2G, you are a station, not a plain old partner. You could have multiple stations (and hence, multiple identities) for trading with different partners. Messages received from partners to your different "identities" would appear under the corresponding stations. Just like having multiple mail accounts configured on the same browser.

(Alright, so I'm a station. Whatever. I don't care. How do I send a message to a partner?)

Well, you just

  • click the New Message button (rocket) on any page,

    'New Message' button

  • pick the intended partner,
  • pick your station,
  • key in a message subject,
  • upload or drag-n-drop the files, and

    Your AS2 message, ready to send!

  • click Send.

Just like email, except that you don't have to type a message body, but you do have to select your own identity (station); with email, you had already done the latter (kind of) when you picked your mail account.

That is, if you have already set up your partner.

(Aha, so that's the catch! How do I set up a partner?)

Well, for that, we gotta go back to the vocabulary lesson—or more precisely, dive a bit into the world of AS2.

As you may already know, AS2 is all about security and reliability in document transfer. To achieve this, AS2 software has to do quite a bit of work on your behalf:

  • for outgoing messages: encrypting and signing, and compressing the content
  • for received messages: decrypting, decompressing and verifying the signature, and generating (and optionally signing) the receipt if one has been requested

While AS2G is already a champion in all this stuff, it still needs to know your partner's expectations (the "partner configuration") in order to perform them. That is why you need to head over to the partners page (ensure you are logged in before clicking the URL!), click the New Partner button at the top, and carefully fill in the resulting form.

(Huh, you sound like it's a piece of cake! If I knew how to fill in that form, why the heck would I be reading your article right now?!)

Calm down, folks, almost there. Let's go field by field. Except for the partner name, type and description, everything else in the form should be provided by your partner, so it basically boils down to bothering your partner until he gives up all the pieces of the puzzle needed to fill in the blanks.

  • Name is, as you have guessed, the name you would like to call your partner (just for display and auditing purposes).
  • Partner Type is usually Production, unless you specifically want to do kind of a "test bed" configuration; if you pick Test you would have to make further changes when you are ready to go for production (for actual business messages), so it's better to stick to Production.
  • AS2 Identifier is an AS2 protocol-level magic word that uniquely identifies your partner. Your partner should provide this to you during the setting-up stages, just like you should send your AS2 ID to him/her; chill out, it's just a piece of text, and maybe digits (alphanumeric, in tech jargon)—you only have to ensure that no other partners of your partner are using the same token, because otherwise your partner may mix up the messages sent by you, with those from other partners.
  • Description is, again, just a blurb of text for your convenience in picking your partners apart.

Okay, that was easy enough. Now the geeky stuff.

  • URI is the HTTP/S address at which your partner is accepting incoming AS2 messages. (If this happens to be HTTPS (i.e. begins with https://), you have a bit more work coming up later, in the Advanced Options section!)
  • Encryption Certificate is a file (containing a "key"—a public key, to be exact) provided by your partner. AS2G uses it to "lock" (encrypt) the messages you send out to him; only your partner, who holds the matching private key, can "unlock" (decrypt) them. This guarantees that no evil force would be able to make any sense of the message while it is in transit, since nobody but the intended recipient can ever decrypt it.
  • Use different certificate as sign certificate is a bit of an advanced option which allows your partner to use a different "key" (file) to generate a "signature" of the message content (which is also sent to you, along with the message, for verification purposes). If your partner does not provide this, he would probably be using the same certificate (let's call it "cert", okay?) for encryption and signing; if he does provide one, just upload it here. Either way, not much to worry about!
  • Sign/Encrypt Chain Certificates are for establishing trust; to indicate that the certs provided by your partner actually do belong to that partner. (Otherwise, a third party could impersonate your partner and send you their certs and details, and capture all your messages that you (believe you) are sending to your partner!) If you don't get it, not to worry; if your partner provides these files, just upload them here!
  • Message Subject, again, is a down-to-earth field that specifies a default subject for messages you send out to this partner (so you don't have to tire your fingers typing it every time).

In some cases you may be able to get away with the above lot; however, if your partner mentions anything about:

  • encryption algorithm,
  • signing or digest algorithm,
  • sign or digest encryption,
  • compression,
  • async MDN or signed MDN,

it may be time to open up that Advanced Options section as well :(

Here we see four main "categories" of options (although there is no visual demarcation), the first being encryption:

  • Encrypt Messages indicates whether your partner expects you to "lock" (encrypt) messages that you are sending out to him—just like the ones he is sending to you. AS2G will do the locking with the encryption cert you uploaded above, so only your partner (holding the matching private key) can unlock them. For messages flowing in the other direction, your partner will need your own "key" (public cert); more on that later!
  • Encryption Algorithm is the locking "mechanism" that you (AS2G, to be exact) will be using for messages being sent out to your partner—as requested by your partner, of course.

Next comes signing:

  • Sign Messages, similarly, indicates whether your partner expects a "signature" (a verifiable, summarized blurb representing the payload) along with each of your messages.
  • Sign Digest Algorithm should be set to your partner's preferred mechanism for producing the abovementioned "signature blurb".
  • Sign Encrypt Algorithm is the locking mechanism that your partner expects you to use on the abovementioned signature (as opposed to the payload mentioned previously). With this, both your message and its signature would be in encrypted form during transit. (There's a tiny illustrative sketch right after this list.)
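AS2 wraps all of this inside S/MIME, so the following is not what AS2G literally does—just a rough, illustrative sketch with Node's built-in crypto module of what the two drop-downs mean: the digest algorithm summarizes the payload, and the signature encryption algorithm locks that summary with your private key. The file names here are hypothetical.

const crypto = require("crypto");
const fs = require("fs");

// the document being sent, and your station's private key (hypothetical file names)
const payload = fs.readFileSync("invoice.edi");
const privateKey = fs.readFileSync("my-station-key.pem");

// "Sign Digest Algorithm" = how the payload is summarized (here: SHA-256)
// "Sign Encrypt Algorithm" = how that summary is locked with your private key (here: RSA)
const signature = crypto.createSign("RSA-SHA256").update(payload).sign(privateKey, "base64");

console.log("Signature that travels alongside the payload:", signature.substring(0, 32) + "...");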

And compression:

  • Compress the messages before signing should be enabled only if your partner likes to see the incoming message in compressed form (primarily to save network overhead and speed up the transfer).
  • Compress the messages after signing is similar to above, but applies to both the message (which may already have been compressed!) and its signature combined.

And, finally (phew!), MDN:

  • Request MDN decides whether you ask your partner to send back a "receipt" (disposition notification) for every message that you send. The MDN mechanism in AS2 is cleverly crafted such that being able to receive and process an MDN guarantees that the message exchange was successful.
  • Request Asynchronous MDN, if you have enabled the above option, indicates that your partner should send back the MDN over a separate channel (i.e. you send your message in one path (HTTP request), which the partner simply acknowledges; later on, your partner generates and sends the MDN back to you in a newly initiated path (a new HTTP request), to which you send another acknowledgement).
  • Request Digitally Signed MDN asks your partner to send a signature along with the MDN, just like you (might have) signed the message itself when sending it to him.

Okay, take a deep breath. Count from one to twenty-five. Whatever. Just keep calm.

To clarify things a bit, let's have a look at an example configuration of a trading partner:

  • called My Awesome Partner
  • with AS2 ID AWSMPTNR,
  • who accepts messages at the URL http://as2.acme.com/as2/inbound

where the messages need to be:

  • compressed, right at the beginning (before signing/encryption),
  • encrypted with AES-256-CBC using the AWSMPTNR-encrypt.pem cert,
  • signed with SHA-256 using the same cert,
  • and the signature itself encrypted again, using RSA, and
  • compressed again (after signing and encryption),

and you wish the partner to:

  • send back a receipt (MDN), which is
  • asynchronous and
  • signed

Applying the above barricade of rules, our partner would appear as follows:

'AWSMPTNR' Partner Configuration, Episode 1: Pilot

'AWSMPTNR' Partner Configuration, Episode 2: Continuation

'AWSMPTNR' Partner Configuration, Episode 3: Advanced Options

Here we have requested our awesome partner to reply with a signed, async MDN (receipt) for every message that we send him.
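For the protocol-curious: here is a rough sketch, in NodeJS, of the kind of HTTP request that would eventually go out for this configuration. The header names come from the AS2 spec (RFC 4130); the values, URLs and payload handling are simplified assumptions for illustration, not AS2G's actual code.

const request = require("request");
const fs = require("fs");

request.post({
    url: "http://as2.acme.com/as2/inbound",          // AWSMPTNR's URL from the config above
    headers: {
        "AS2-Version": "1.2",
        "AS2-From": "MYSTATION",                     // your station's AS2 ID (hypothetical)
        "AS2-To": "AWSMPTNR",                        // the partner's AS2 ID
        "Message-ID": "<1527230000000@mystation>",   // unique per message (hypothetical)
        "Subject": "PO-20180525-001",                // hypothetical subject
        // the compressed + signed + encrypted payload travels as S/MIME enveloped data:
        "Content-Type": 'application/pkcs7-mime; smime-type=enveloped-data; name="smime.p7m"',
        // ask for a signed, asynchronous MDN, delivered to a URL of ours (hypothetical):
        "Disposition-Notification-To": "https://my-as2gateway-url/mdn",
        "Receipt-Delivery-Option": "https://my-as2gateway-url/mdn",
        "Disposition-Notification-Options":
            "signed-receipt-protocol=required, pkcs7-signature; signed-receipt-micalg=required, sha-256"
    },
    body: fs.readFileSync("payload.p7m")             // already compressed, signed and encrypted (hypothetical file)
}, (error, response) => {
    if (error) return console.error("Send failed", error);
    console.log("Partner responded with HTTP", response.statusCode);
});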

If you assume, for a moment, that AWSMPTNR later happens to start using an HTTPS URL for receiving messages—say, with an HTTPS certificate signed by a third party that you don't directly trust—the top (HTTPS-related) portion of your Advanced Options section may look like:

'AWSMPTNR' Partner Configuration, Bonus Episode 1: HTTPS

You don't trust this third-party guy who signed your partner's HTTPS cert, but the third party has its own cert signed (verified) by another CA that you do trust—which forms a trust chain wherein you trust the third party's cert because it is signed by a known CA, and in turn you trust your partner's cert because it has been signed by someone that you do trust now.

Trust chains. Now, how sick is that?!

Going a bit further, what would you see if your partner starts using a different sign certificate AWSMPTNR-sign.pem, verified by a third party verified by another third (fourth!) party verified by another (fifth!) party; where you (luckily) trust the last one?

'AWSMPTNR' Partner Configuration, Bonus Episode 2: Sign Cert, Bound in Chains

In case you lost it entirely: those two certs uploaded under Sign/Encrypt Chain Certificates are the "third" and "fourth" parties, which help AS2G build the trust chain from the top-level trusted CA all the way down to your partner's sign cert.

If the above stuff seems totally Greek (or Chinese, gibberish, whatever) to you (and rightly so!), feel free to consult AdroitLogic to get things in order; or, if you feel adventurous, go ahead and mess around with the switches and drop-downs on your own :) It's highly unlikely that you're going to break anything!

Phew, finally we have our partner in all his glory, ready to roll.

With me so far? :)

If not, just take a deep breath and read through the stuff again, slowly and carefully (maybe skipping the techy innards, if you feel better that way); it would eventually make sense.

Though we have our partner, we don't have a station yet. So let's go ahead and create one.

Head over to the station page and click the New Station button.

Your secret passage to creating a new Station

You'll end up in another form; just like the partner configuration thing, but much simpler.

(I knew it... Another form. How many more of those until I get to actually send a message?)

Just this one, I promise.

Here you can define how your partners should send messages to you. This includes:

  • Name: again, just a name for your convenience.
  • AS2 Identifier: a unique identifier for yourself, just like in the case of a partner, kind of like your own NIC or Social Security Number.
  • Email: the email address you would like to associate with this station; just for email notifications of incoming messages.
  • Enable Email Notifications: by default AS2G will notify you via email as soon as your station receives a new message from one of your partners; you can disable this feature using this switch if it becomes too annoying.
  • Description: again, just a blurb of text for your convenience

Now begins the (thankfully, much shorter) list of geeky stuff, primarily for defining your station's keystore (where AS2G would be storing all the keys and certificates that allow you to communicate securely with your partners; similar to what you saw in the partner configurations):

  • Key Store: here, AS2G allows you to either
  1. generate a new keystore, on the house! (whose fragments you can later share with your partner)—obviously the easiest option!
  2. upload an existing keystore from your computer—handy if you are migrating stuff from a different station (or an entirely different AS2 platform) but don't want to bother your partner; i.e. your partner will see you in exactly the same way as he did before, although you are now using a new station! (Note, however, that this won't work if your new station's name differs from that of the old station :()
  3. select one that has already been uploaded, and is available in your certificate store right now.

The last two options are straightforward (upload a file, or pick one from a list), so let's have a closer look at the first one (which you would be picking anyway). You would have to fill in some details of the newly generated keypair for your brand new keystore, which are standard to the X.509 format (now, now, no need to google it!).

  • Common Name: This is the only mandatory field; put in either your own name, or maybe your station's name?
  • Organization Unit and Organization: kinda like your department, and the parent organization name
  • City, State, Country: address (you get it)

Now the geeky stuff (again!):

  • Key Length: the size of the key; generally, the bigger the key, the more secure your encrypted message is—and the slower the encryption/decryption process would be. Usually 1024 or 2048 would suffice. (There's a small illustrative sketch right after this list.)
  • Keystore Password: a password for the new keystore, so you can inspect it later on if required
  • Certificate Password: a password for your key itself (private keys need to be stored with great care!)
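For the curious: generating a keystore essentially boils down to creating a public/private key pair of the chosen Key Length and protecting the private key with the Certificate Password. Here's a rough, purely illustrative sketch with Node's crypto module—it needs Node 10.12+ and is not AS2G's actual code; the passphrase is a placeholder.

const crypto = require("crypto");

const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
    modulusLength: 2048,                        // the "Key Length" field
    publicKeyEncoding: { type: "spki", format: "pem" },
    privateKeyEncoding: {
        type: "pkcs8",
        format: "pem",
        cipher: "aes-256-cbc",
        passphrase: "my-certificate-password"   // the "Certificate Password" field (placeholder)
    }
});

// the public half is what eventually gets shared with your partner
console.log(publicKey.split("\n")[0]);          // -----BEGIN PUBLIC KEY-----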

'AWSMSTTN' Station Configuration, Episode 1: Pilot

'AWSMSTTN' Station Configuration, Episode 2: Finale

Now you have configured your side of the bargain. But your partner also needs to configure a "partner" on his own system, that corresponds to you (since you, after all, are his business partner). Hence you need to somehow pass your details through to him, so that his partner configuration is compatible with your station configuration.

Your new station, from a partner's-eye view

Now, how do you do that?

Lucky for you, AS2G can do it for you!

(Really? Finally, something cool.)

Once you open your brand new station from the Stations page, enter the email address of your partner in the Share this configuration text field, and click Send. Your partner will receive your station configuration via email, and can configure your partner profile on his side accordingly.

Go ahead, click 'Send' to pass the ball to your partner!

Now follows another wait (hopefully not too long!) until your partner confirms that he has added your info to his AS2 system; just like we did for him. Yup, patience is a virtue.

Now, for the first time in history, you are ready to send your first message!

Go ahead, click that rocket icon, pick your partner, select your files and click Send!

From this point onwards, AS2G is as simple as that!

For a brief time, you may see your message in a "queued" (pending) state:

Your first message, queued, waiting impatiently to be picked up

Usually, however, AS2G will pick it up and send it out within seconds:

Yay, she's on her way!

In the unlikely (or likely, maybe? After all, this is your first message!) event that your message decides to play a trick on you and fails to be sent out, you will see it under the "Failed" tab of the message view, along with the failure cause:

Okay, it failed; but not to worry!

In that case, you can open up the failed message and use the various controls to navigate through its metadata and failure details.

Detailed message view with diagnostic info

Oh, and did I mention that you can contact the AS2G Team anytime to learn more details, and more importantly, about how to avoid the failure in the future?

Contact AS2Gateway Team!

Now that you know the basics, things are going to get easier, way more useful, and way more fun!

Got a new partner? Add a new one on the Partners page, share your station details with him, and pick the new partner from the Partner drop-down when sending your messages!

Partner not happy with your station config? Just create a new station with different, more partner-friendly configs, share it with your partner, and select the new station from the Station drop-down when sending to that partner!

See all your inbound and outbound messages in your message view, along with corresponding acknowledgements (MDNs): just like your mailbox!

Receipt (MDN) for an outbound message

Attachments of a received message

Whenever you need to analyze a message in depth, or troubleshoot a failure, just contact the AS2G Team—or, if you feel adventurous, dive down into the details yourself, with the abovementioned detailed message view including transport-level headers, error metadata and retry information.

Message controls: dive down, right into the details!

Transport headers of a received message: what gets sent over the wire (or maybe wireless)!

AS2G has a lot more in store for you; so, my dear folks, it's time to get going!

Monday, May 14, 2018

How to rob a bank: no servers - just a ballpoint pen!

Okay, let's face it: this article has nothing to do with robbery, banks or, heck, ballpoint pens; but it's a good attention grabber (hopefully!), thanks to Chef Horst of Gusteau's. (Apologies if that broke your heart!)

Rather, this is about getting your own gossip feed—sending you the latest and the hottest, within minutes of them becoming public—with just an AWS account and a web browser!

Maybe not as exciting as a bank robbery, but still worth reading on—especially if you're a gossip fan and like to always have an edge over the rest of your buddies.

Kicking out the server

Going with the recent hype, we will be using serverless technologies for our mission. You guessed it, there's no server involved. (But, psst, there is one!)

Let's go with AWS, which offers an attractive Free Tier in addition to a myriad of rich serverless utilities: CloudWatch scheduled events to trigger our gossip hunt, DynamoDB to store gossips and track changes, and SNS-based SMS to dispatch new gossips right to your mobile!

And the best part is: you will be doing everything—from defining entities and composing lambdas to building, packaging and deploying the whole set-up—right inside your own web browser, without ever having to open up a single tedious AWS console!

All of it made possible thanks to Sigma, the brand new truly serverless IDE from SLAppForge.

Sigma: Think Serverless!

The grocery list

First things first: sign up for a Sigma account, if you haven't already. All it takes is an email address, AWS account (comes with that cool free tier, if you're a new user!), GitHub account (also free) and a good web browser. We have a short-and-sweet writeup to get you started within minutes; and will probably come up with a nice video as well, pretty soon!

A project is born

Once you are in, create a new project (with a catchy name to impress your buddies—how about GossipHunter?). The Sigma editor will create a template lambda for you, and we can start right away.

GossipHunter at t = 0

Nurtured with <3 by NewsAPI

As my gossip source, I picked the Entertainment Weekly API by newsapi.org. Their API is quite simple and straightforward, and a free signup with just an email address gets you an API key with 1000 requests per day! In case you have your own favourite source, feel free to switch just the API request part of the code (coming up soon!), and the rest should work just fine!

The recipe

Our lambda will be periodically pulling data from this API, comparing the results with what we already know (stored in DynamoDB) and sending out SMS notifications (via SNS) to your phone number (or email, or whatever other preferred medium that SNS offers) for any previously unseen (hence "hot") results. We will store any newly seen topics in DynamoDB, so that we can prevent ourselves from sending out the same gossip repeatedly.

(By the way, if you have access to a gossip API that actually emits/notifies you of latest updates (e.g. via webhooks) rather than us having to poll for and filter them, you can use a different, more efficient approach such as configuring an API Gateway trigger and pointing the API webhook to the trigger endpoint.)
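For the record, such a push-style handler could look roughly like the sketch below—a minimal example assuming a standard API Gateway proxy integration, where the webhook payload arrives as a JSON string in event.body (the payload fields are assumptions about what such a news webhook might send):

exports.handler = (event, context, callback) => {
    // the pushed gossip arrives in the request body (assumed to be JSON)
    const article = JSON.parse(event.body);
    console.log("Received pushed gossip:", article.title);

    // ... dedupe against DynamoDB and dispatch via SNS, just like in the polling version below ...

    // respond to the webhook caller so it knows we got the goods
    callback(null, { statusCode: 200, body: JSON.stringify({ received: true }) });
};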

Okay, let's chop away!

The wake-up call(s)

First, let's drag a CloudWatch entry from the left Resources pane and configure it to fire our lambda; to prevent distractions during working hours, we will configure it to run every 15 minutes, only from 7 PM (when you are back from work) to midnight, and from 5 AM to 8 AM (before you head off to work). This can be easily achieved through a New, Schedule-type trigger that uses a cron expression such as 0/15 5-7,19-23 ? * MON-FRI *. (Simply paste 0/15, 5-7,19-23 (no spaces within the value) and MON-FRI into the Minutes, Hours and Day of Week fields respectively, and type a ? under Day of Month.)

CloudWatch Events trigger: weekdays

But wait! The real value of gossip is certainly in the weekend! So let's add (drag, drop, configure) another trigger to run GossipHunter all day (5 AM - midnight!) over the weekend; just another cron with 0/10 (every ten minutes this time! we need to be quick!) in Minutes, 5-23 in Hours, ? in Day of Month and SAT,SUN in Day of Week.

CloudWatch Events trigger: weekends
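For reference, the two complete schedule expressions (in CloudWatch's minutes, hours, day-of-month, month, day-of-week, year order) boil down to:

  0/15 5-7,19-23 ? * MON-FRI *     (weekdays: every 15 minutes, 5-8 AM and 7 PM-midnight)
  0/10 5-23 ? * SAT,SUN *          (weekends: every 10 minutes, 5 AM-midnight)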

Okay, time to start coding!

Grabbing the smoking hot stuff

Let's first fetch the latest gossips from the API. The request module could do this for us in a heartbeat, so we'll go get it: click the Add Dependency button on the toolbar, type in request and click Add once our subject appears in the list:

'Add Dependency' button

Now for the easy part:

  request.get(`https://newsapi.org/v2/top-headlines?sources=entertainment-weekly&apiKey=your-api-key`,
  (error, response, body) => {

    callback(null,'Successfully executed');
  })

Gotta hide some secrets?

Wait! The apiKey parameter: do I have to specify the value in the code? Since I'd probably be saving all this in GitHub (yup, you guessed right!), won't that compromise my token?

We also had the same question; and that's exactly why, just a few weeks ago, we introduced the environment variables feature!

Go ahead, click the Environment Variables ((x)) button, and define a KEY variable (associated with our lambda) holding your API key. This value will be available for your lambda at runtime, but it will not be committed into your source; you can simply provide the value during your first deployment after opening the project. And so can any of your colleagues (with their own API keys, of course!) when they get jealous and want to try out their own copy of your GossipHunter!

Defining the 'KEY' environment variable

(Did I mention that your friends can simply grab your GossipHunter's GitHub repo URL—once you have saved your project—and open it in Sigma right away, and deploy it on their own AWS account? Oh yeah, it's that easy!)

Cool! Okay, back to business.

Before we forget it, let's append process.env.KEY to our NewsAPI URL:

  request.get(`https://newsapi.org/v2/top-headlines?sources=entertainment-weekly&apiKey=${process.env.KEY}`,

And extract out the gossips list, with a few sanity checks:

  (error, response, body) => {
    let result = JSON.parse(body);
    if (result.status !== "ok") {
      return callback('NewsAPI call failed!');
    }
    result.articles.forEach(article => {

    });

    callback(null,'Successfully executed');
  })

Sifting out the not-so-hot

Now the tricky part: we have to compare these with the most recent gossips that we have dispatched, to detect whether they are truly "new" ones, i.e. filter the ones that have not already been dispatched.

For starters, we shall maintain a DynamoDB table gossips to retain the gossips that we have dispatched, serving as GossipHunter's "memory". Whenever a "new" gossip (i.e. one that is not already in our table) is encountered, we shall send it out via SNS (the Simple Notification Service) and add it to our table so that we will not send it out again. (Later on we can improve our "memory" to "forget" (delete) old entries so that it does not keep on growing indefinitely, but for the moment, let's not worry about it.)

What's that, Dynamo-DB?

For the DynamoDB table, simply drag a DynamoDB entry from the resources pane into the editor, right into the forEach callback. Sigma will show you a pop-up where you can define your table (without a round trip to the DynamoDB dashboard!) and the operation you intend to perform on it. Right now we need to query the table for the gossip in the current iteration, so we can wrap that up by

  • entering gossips into the Table Name field and url for the Partition Key,
  • selecting the Get Document operation, and
  • entering @{article.url} (note the familiar, ${}-like syntax?) in the Partition Key field.

Your brand new DynamoDB table 'gossips' with a 'Get Document' operation

      result.articles.forEach(article => {
        ddb.get({
          TableName: 'gossips',
          Key: { 'url': article.url }
        }, function (err, data) {
          if (err) {
            //handle error
          } else {
            //your logic goes here
          }
        });

      });

In the callback, let's check if DynamoDB found a match (ignoring any failed queries):

        }, function (err, data) {
          if (err) {
            console.log(`Failed to check for ${article.url}`, err);
          } else {
            if (data.Item) {  // match found, meaning we have already saved it
              console.log(`Gossip already dispatched: ${article.url}`);
            } else {

            }
          }
        });

Compose (160 characters remaining)

In the nested else block (when we cannot find a matching gossip), we prepare an SMS-friendly gossip text (including the title, and optionally the description and URL if we can stuff them in; remember the 160-character limit?). (Later you can tidy things up by throwing in a URL-shortener logic and so on, but for the sake of simplicity, I'll pass.)

            } else {
              let titleLen = article.title.length;
              let descrLen = article.description.length;
              let urlLen = article.url.length;

              let gossipText = article.title;
              if (gossipText.length + descrLen < 160) {
                gossipText += "\n" + article.description;
              }
              if (gossipText.length + urlLen < 160) {
                gossipText += "\n" + article.url;
              }

Hitting "Send"

Now we can send out our gossip as an SNS SMS. For this,

  • drag an SNS entry from the left pane into the editor, right after the last if block,
  • select Direct SMS as the Resource Type,
  • enter your mobile number into the Mobile Number field,
  • populate the SMS text field with @{gossipText},
  • type in GossipHuntr as the Sender ID (unfortunately the sender ID cannot be longer than 11 characters, but it doesn't really matter since it is just the text message sender's name; besides, GossipHuntr is more catchy, right? :)), and
  • click Inject.

But...

Wait! What would happen if your best buddy grabs your repo and deploys it? His gossips would also start flowing into your phone!

Perhaps a clever trick would be to extract out the phone number into another environment variable, so that you and your best buddy can pick your own numbers (and part ways, still as friends) at deployment time. So click the (x) again and add a new PHONE variable (with your phone number), and use it in the Mobile Number field instead as (you guessed it!) @{process.env.PHONE}:

Behold: gossip SMSs are on their way!

            } else {
              let titleLen = article.title.length;
              let descrLen = article.description.length;
              let urlLen = article.url.length;

              let gossipText = article.title;
              if (gossipText.length + descrLen < 160) {
                gossipText += "\n" + article.description;
              }
              if (gossipText.length + urlLen < 160) {
                gossipText += "\n" + article.url;
              }

              sns.publish({
                Message: gossipText,
                MessageAttributes: {
                  'AWS.SNS.SMS.SMSType': {
                    DataType: 'String',
                    StringValue: 'Promotional'
                  },
                  'AWS.SNS.SMS.SenderID': {
                    DataType: 'String',
                    StringValue: 'GossipHuntr'
                  },
                },
                PhoneNumber: process.env.PHONE
              }).promise()
                .then(data => {
                  // your code goes here
                })
                .catch(err => {
                  // error handling goes here
                });
            }

(In case you got overexcited and clicked Inject before reading the but... part, chill out! Dive right into the code, and change the PhoneNumber parameter under the sns.publish(...) call; ta da!)

Tick it off, and be done with it!

One last thing: for this whole contraption to work properly, we also need to save the "new" gossip in our table. Since you have already defined the table during the query operation, you can simply drag it from under the DynamoDB list on the resources pane (click the down arrow on the DynamoDB entry to see the table definition entry); drop it right under the SNS SDK call, select Put Document as the operation, and configure the new entry as url = @{article.url} (by clicking the Add button under Values and entering url as the key and @{article.url} as the value).

Dragging the existing DynamoDB table in; for our last mission

Adding a 'sent' marker for the 'hot' gossip that we just texted out

                .then(data => {
                  ddb.put({
                    TableName: 'gossips',
                    Item: { 'url': article.url }
                  }, function (err, data) {
                    if (err) {
                      console.log(`Failed to save marker for ${article.url}`, err);
                    } else {
                      console.log(`Saved marker for ${article.url}`);
                    }
                  });
                })
                .catch(err => {
                  console.log(`Failed to dispatch SMS for ${article.url}`, err);
                });
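If, as hinted earlier, you later want the table to "forget" old gossips, one option—an assumption on my part, not something we wire up here—is DynamoDB's TTL feature: store an expiry timestamp with each item and enable TTL on that attribute for the gossips table, and DynamoDB will quietly delete expired items for you. The put would then look something like:

                  ddb.put({
                    TableName: 'gossips',
                    Item: {
                      'url': article.url,
                      // with TTL enabled on 'expireAt', this item vanishes roughly a week later
                      'expireAt': Math.floor(Date.now() / 1000) + 7 * 24 * 3600
                    }
                  }, function (err, data) { /* same handling as above */ });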

Time to polish it up!

Since we'd be committing this code to GitHub, let's clean it up a bit (all your buddies would see this, remember?) and throw in some comments:

let AWS = require('aws-sdk');
const sns = new AWS.SNS();
const ddb = new AWS.DynamoDB.DocumentClient();
let request = require('request');

exports.handler = function (event, context, callback) {

  // fetch the latest headlines
  request.get(`https://newsapi.org/v2/top-headlines?sources=entertainment-weekly&apiKey=${process.env.KEY}`,
    (error, response, body) => {

      // early exit on failure
      let result = JSON.parse(body);
      if (result.status !== "ok") {
        return callback('NewsAPI call failed!');
      }

      // check each article, processing if it hasn't been already
      result.articles.forEach(article => {
        ddb.get({
          TableName: 'gossips',
          Key: { 'url': article.url }
        }, function (err, data) {
          if (err) {
            console.log(`Failed to check for ${article.url}`, err);
          } else {
            if (data.Item) {  // we've seen this previously; ignore it
              console.log(`Gossip already dispatched: ${article.url}`);

            } else {
              let titleLen = article.title.length;
              let descrLen = article.description.length;
              let urlLen = article.url.length;

              // stuff as much content into the text as possible
              let gossipText = article.title;
              if (gossipText.length + descrLen < 160) {
                gossipText += "\n" + article.description;
              }
              if (gossipText.length + urlLen < 160) {
                gossipText += "\n" + article.url;
              }

              // send out the SMS
              sns.publish({
                Message: gossipText,
                MessageAttributes: {
                  'AWS.SNS.SMS.SMSType': {
                    DataType: 'String',
                    StringValue: 'Promotional'
                  },
                  'AWS.SNS.SMS.SenderID': {
                    DataType: 'String',
                    StringValue: 'GossipHuntr'
                  },
                },
                PhoneNumber: process.env.PHONE
              }).promise()
                .then(data => {
                  // save the URL so we won't send this out again
                  ddb.put({
                    TableName: 'gossips',
                    Item: { 'url': article.url }
                  }, function (err, data) {
                    if (err) {
                      console.log(`Failed to save marker for ${article.url}`, err);
                    } else {
                      console.log(`Saved marker for ${article.url}`);
                    }
                  });
                })
                .catch(err => {
                  console.log(`Failed to dispatch SMS for ${article.url}`, err);
                });
            }
          }
        });
      });

      // notify AWS that we're good (no need to track/notify errors at the moment)
      callback(null, 'Successfully executed');
    })
}

All done!

3, 2, 1, ignition!

Click Deploy on the toolbar, which will set a chain of actions in motion: first the project will be saved (committed to your own GitHub repo, with a commit message of your choosing), then built and packaged (fully automated!) and finally deployed into your AWS account (giving you a chance to review the deployment summary before it is executed).

deployment progress

Once the progress bar hits the end and the deployment status says CREATE_COMPLETE (or UPDATE_COMPLETE in case you missed a spot and had to redeploy), GossipHunter is ready for action!

Houston, we're GO!

Until your DynamoDB table is primed up (populated with enough gossips to start waiting for updates), you would receive a trail of gossip texts. After that, whenever a new gossip comes up, you will receive it on your mobile within a matter of minutes!

All thanks to the awesomeness of serverless and AWS, and Sigma that brings it all right into your web browser.

Sunday, April 22, 2018

Deploying your stuff with Google Cloud Deployment Manager: via NodeJS

This may not be the correct way; heck, this may be the crappiest way. I'm putting this up because I could not find a single decent sample on how to do it with JS.

The approach in this post uses NodeJS (server-side), but it is possible to do the same on the client side by loading the Google API client and subsequently the deploymentmanager v2 module; I'll write about it as well, if/when I get a chance.

First you set up authentication so your googleapis client can obtain a token automatically.

Assuming that you have added the googleapis:28.0.1 NPM module to your dependencies list, and downloaded your service account key into the current directory (where the deploymentmanager-invoking code is residing):

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

I used a service account, so YMMV.

If you like, you can cache the token at dev time by adding some more gimmicks: I used axios-debug-log to intercept the auth response and persist the token to a local file, from which I read the token during subsequent runs (if the token expires the JWT client will automatically refresh it, which I will then persist):

process.env.log = "axios";
const tokenFile = "./token.json";
require("axios-debug-log")({
    // disable extra logging
    request: function (debug, config) {},
    error: function (debug, error) {},
    response: function (debug, response) {
        // grab and save access token for reuse
        if (response.data.access_token) {
            console.log("Updating token");
            require("fs").writeFile(tokenFile, JSON.stringify(response.data));
        }
    },
});

// load saved token; if success, use OAuth2 client with loaded token instead of JWT client
// (avoid re-auth at each run)
try {
    const token = require(tokenFile);
    if (!token.access_token) {
        throw Error("no token found");
    }
    token.refresh_token = token.refresh_token || "refreshtoken";    //mocking
    console.log("Using saved tokens from", tokenFile);
    jwtClient.setCredentials(token);
} catch (e) {
    console.log(e.message);
}

Fair enough. Now to get the current state of the deployment:

const projectId = "your-gcp-project-id";
const deployment = "your-deployment-name";

const deployments = google.deploymentmanager("v2").deployments;

let fingerprint = null;

let currentState = deployments.get({
    project: projectId,
    deployment: deployment
})
    .then(response => {
        fingerprint = response.data.fingerprint;
        console.log("Fingerprint", fingerprint);
        return Promise.resolve(response);
    })
    .then(response => {
        // continue the logic
    });

The "fingerprint logic" is needed because we need to pass a "fingerprint" to every "write" (update (preview/start), stop, cancelPreview etc.) operation in order to guarantee in-order execution and operation synchronization.

That done, we set up an update for our deployment by creating a deployment preview (shell) within the last .then():

    .then(response => {
        console.log("Creating deployment preview", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: true,
            resource: {
                name: deployment,
                fingerprint: fingerprint,
                target: {
                    config: {
                        content: JSON.stringify({
                            resources: [
                                /* your resource definitions here; e.g.

                                {
                                    name: "myGcsBucket",
                                    type: "storage.v1.bucket",
                                    properties: {
                                        storageClass: "STANDARD",
                                        location: "US",
                                        labels: {
                                            "keyOne": "valueOne"
                                        }
                                    }
                                }
                                
                                and so on */
                            ]
                        }, 4, 2)
                    }
                }
            }
        })
            .catch(e => err("Failed to preview deployment", e))
    })

// small utility function for one-line throws

const err = (msg, e) => {
    console.log(`${msg}: ${e}`);
    throw e;
};

Notice that we passed fingerprint as part of the payload. Without it, Google would complain that it expected one.

But now, we again need to call deployments.get() because the fingerprint would have been updated! (Why the heck doesn't Google return the fingerprint in the response itself?!)

Maybe it's easier to just wrap the modification calls inside a utility code snippet:

const filter = {
    project: projectId,
    deployment: deployment
};

const ensureFingerprint = promise =>
    promise
        .then(response => deployments.get(filter))
        .then(response => {
            fingerprint = response.data.fingerprint;
            console.log("Fingerprint", fingerprint);
            return Promise.resolve(response);
        });

// ...

let preview = ensureFingerprint(Promise.resolve(null))   // only obtain the fingerprint
    .then(response => {
        console.log("Creating deployment preview", deployment);
        return ensureFingerprint(deployments.update({
            // same payload from previous code block
        }))
            .catch(e => err("Failed to preview deployment", e))
    })

True, it's nasty to have a global fingerprint variable. You can pick your own way.

Meanwhile, if the initial deployments.get() fails because no deployment exists with the given name, we can create one (along with a preview) right away:

    .catch(e => {
        // fail unless the error is a 'not found' error
        if (e.code === 404) {
            console.log("Deployment", deployment, "not found, creating");
            return ensureFingerprint(deployments.insert({
                // same payload as the update() call above, minus the fingerprint
                project: projectId,
                deployment: deployment,
                preview: true,
                resource: {
                    name: deployment,
                    target: {
                        config: {
                            content: JSON.stringify({
                                resources: [
                                    // your resource definitions here
                                ]
                            }, 4, 2)
                        }
                    }
                }
            }))
                .catch(e => err("Deployment creation failed", e));
        } else {
            err("Unknown failure in previewing deployment", e);
        }
    });

Now let's keep on "monitoring" the preview until it reaches a stable state (DONE, CANCELLED etc.):

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

let timer = {
    handle: null
};
preview.then(response => {
    console.log("Starting preview monitor", deployment);
    startTimer(() => {
        return deployments.get(filter)
            .catch(e => {
                //TODO detect and ignore temporary failures
                err("Unexpected error in monitoring preview", e);
            })
            .then(response => {
                let op = response.data.operation;
                let status = op.status;
                console.log(status, "at", op.progress, "%");

            })
    }, timer, 5000);
});

And check if we reached a terminal (completion) state:

const SUCCESS_STATES = ["SUCCESS", "DONE"];
const FAILURE_STATES = ["FAILURE", "CANCELLED"];
const COMPLETE_STATES = SUCCESS_STATES.concat(FAILURE_STATES);

// ...

            .then(response => {
                // ...

                if (COMPLETE_STATES.includes(status)) {
                    console.log("Preview completed with status", status);
                    if (SUCCESS_STATES.includes(status)) {
                        if (op.error) {
                            console.error("Errors:", op.error);
                        } else {
                            
                        }
                    } else if (FAILURE_STATES.includes(status)) {
                        console.log("Preview failed, skipping deployment");
                    }
                    return false;
                }
                return true;

If we reach a success state, we can commence the actual deployment:

                        // ...
                        } else {
                            deploy();
                        }

// ...

const deploy = () => {
    let deployer = () => {
        console.log("Starting deployment", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: false,
            resource: {
                name: deployment,
                fingerprint: fingerprint    // the fingerprint belongs inside the deployment resource body
            }
        })
            .catch(e => err("Deployment startup failed", e))
    };

And start monitoring again, until we reach a completion state:

    // ...

    deployer().then(response => {
        console.log("Starting deployment monitor", deployment);
        startTimer(() => {
            return deployments.get(filter)
                .catch(e => {
                    //TODO detect and ignore temporary failures
                    err("Unexpected error in monitoring deployment", e);
                })
                .then(response => {
                    let op = response.data.operation;
                    let status = op.status;
                    console.log(status, "at", op.progress, "%");

                    if (COMPLETE_STATES.includes(status)) {
                        console.log("Deployment completed with status", status);
                        if (op.error) {
                            console.error("Errors:", op.error);
                        }
                        return false;  // stop
                    }
                    return true;  // continue
                })
        }, timer, 5000);
    });
};

Recap:

const SUCCESS_STATES = ["SUCCESS", "DONE"];
const FAILURE_STATES = ["FAILURE", "CANCELLED"];
const COMPLETE_STATES = SUCCESS_STATES.concat(FAILURE_STATES);

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

const projectId = "your-gcp-project-id";
const deployment = "your-deployment-name";

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

// small utility function for one-line throws

const err = (msg, e) => {
    console.log(`${msg}: ${e}`);
    throw e;
};

let timer = {
    handle: null
};

const deployments = google.deploymentmanager("v2").deployments;

const filter = {
    project: projectId,
    deployment: deployment
};

let fingerprint = null;
const ensureFingerprint = promise =>
    promise
        .then(response => deployments.get(filter))
        .then(response => {
            fingerprint = response.data.fingerprint;
            console.log("Fingerprint", fingerprint);
            return Promise.resolve(response);
        });

let preview = ensureFingerprint(Promise.resolve(null))   // only obtain the fingerprint
    .then(response => {
        console.log("Creating deployment preview", deployment);
        return ensureFingerprint(deployments.update({
            project: projectId,
            deployment: deployment,
            preview: true,
            resource: {
                name: deployment,
                fingerprint: fingerprint,
                target: {
                    config: {
                        content: JSON.stringify({
                            resources: [
                                // your resource definitions here
                            ]
                        }, 4, 2)
                    }
                }
            }
        }))
            .catch(e => err("Failed to preview deployment", e))
    })
    .catch(e => {
        // fail unless the error is a 'not found' error
        if (e.code === 404) {
            console.log("Deployment", deployment, "not found, creating");
            return ensureFingerprint(deployments.insert({
                // same payload as the update() call above, minus the fingerprint
                project: projectId,
                deployment: deployment,
                preview: true,
                resource: {
                    name: deployment,
                    target: {
                        config: {
                            content: JSON.stringify({
                                resources: [
                                    // your resource definitions here
                                ]
                            }, 4, 2)
                        }
                    }
                }
            }))
                .catch(e => err("Deployment creation failed", e));
        } else {
            err("Unknown failure in previewing deployment", e);
        }
    });

preview.then(response => {
    console.log("Starting preview monitor", deployment);
    startTimer(() => {
        return deployments.get(filter)
            .catch(e => {
                //TODO detect and ignore temporary failures
                err("Unexpected error in monitoring preview", e);
            })
            .then(response => {
                let op = response.data.operation;
                let status = op.status;
                console.log(status, "at", op.progress, "%");

                if (COMPLETE_STATES.includes(status)) {
                    console.log("Preview completed with status", status);
                    if (SUCCESS_STATES.includes(status)) {
                        if (op.error) {
                            console.error("Errors:", op.error);
                        } else {
                            deploy();
                        }
                    } else if (FAILURE_STATES.includes(status)) {
                        console.log("Preview failed, skipping deployment");
                    }
                    return false;  // stop
                }
                return true;  // continue
            })
    }, timer, 5000);
});

const deploy = () => {
    let deployer = () => {
        console.log("Starting deployment", deployment);
        return deployments.update({
            project: projectId,
            deployment: deployment,
            preview: false,
            resource: {
                name: deployment,
                fingerprint: fingerprint
            }
        })
            .catch(e => err("Deployment startup failed", e))
    };

    deployer().then(response => {
        console.log("Starting deployment monitor", deployment);
        startTimer(() => {
            return deployments.get(filter)
                .catch(e => {
                    //TODO detect and ignore temporary failures
                    err("Unexpected error in monitoring deployment", e);
                })
                .then(response => {
                    let op = response.data.operation;
                    let status = op.status;
                    console.log(status, "at", op.progress, "%");

                    if (COMPLETE_STATES.includes(status)) {
                        console.log("Deployment completed with status", status);
                        if (op.error) {
                            console.error("Errors:", op.error);
                        }
                        return false;  // stop
                    }
                    return true;  // continue
                })
        }, timer, 5000);
    });
};

That should be enough to get you going.

Good luck!

Serverless is the new Build Server: Google CloudBuild (Container Builder) via NodeJS

Google's CloudBuild (a.k.a. "Container Builder") is an on-demand, container-based build service offered under the Google Cloud Platform (GCP). For you and me, it is a nice alternative to maintaining and paying for our own build server, and a clever addition to anyone's CI stack.

CloudBuild allows you to start from a source (e.g. a Google Cloud Source repo, a GCS bucket, or perhaps even nothing—a blank directory, "scratch"), incrementally apply several Docker container runs on it, and publish the final result to a desired location: like a Docker repository or a GCS bucket.

With its wide variety of custom builders, CloudBuild can do almost anything—that is, as far as I can see, anything that can be achieved by a Docker container and a volume mount can be accomplished in CloudBuild as well. To our great satisfaction, this includes fetching sources from GitHub/BitBucket repos (in addition to the native source location options), running custom commands like zip, and much more!

On top of all this (how nice of GCP!), CloudBuild gives you 2 whole hours (120 minutes) of build time per day, for free—compared to AWS's equivalent CodeBuild service, which offers just 100 build minutes (1 hour and 40 minutes) per month; that's roughly 60 free hours a month versus less than 2!

So, now let's have a look at how we can run a CloudBuild via JS (server-side NodeJS):

First things first: adding googleapis:28.0.1 to our dependency list:

{
  "dependencies": {
    "googleapis": "28.0.1"
  }
}

Don't forget the npm install!

In our logic flow, first we need to get ourselves authenticated; with the google-auth-library module that comes with googleapis, this is quite straightforward because the client can be fed with a JWT auth client right from the beginning, which will handle all the auth stuff behind the scenes:

const projectId = "my-gcp-project-id";

const google = require("googleapis").google;

const key = require("./keys.json");
const jwtClient = new google.auth.JWT({
    email: key.client_email,
    key: key.private_key,
    scopes: ["https://www.googleapis.com/auth/cloud-platform"]
});
google.options({auth: jwtClient});

Note that, for the above code to work verbatim, you need to place a service account key file in the current directory (usually obtained by creating a new service account via the Google Cloud console, in case you don't already have one).

Now we can simply retrieve the v1 version of the cloudbuild client from google, and start our magic:

const builds = google.cloudbuild("v1").projects.builds;

First we submit a build "spec" to the CloudBuild service. Below is an example for a typical NodeJS module on GitHub:

builds.create({
    projectId: projectId,
    resource: {
        steps: [
            {
                name: "gcr.io/cloud-builders/git",
                args: ["clone", "https://github.com/slappforge/slappforge-sdk", "."]
            },
            {
                name: "gcr.io/cloud-builders/npm",
                args: ["install"]
            },
            {
                name: "kramos/alpine-zip",
                args: [
                    "-q",
                    "-x", "package.json", ".git/", ".git/**", "README.md",
                    "-r",
                    "slappforge-sdk.zip",
                    "."
                ]
            },
            {
                name: "gcr.io/cloud-builders/gsutil",
                args: [
                    "cp",
                    "slappforge-sdk.zip",
                    "gs://sdk-archives/slappforge-sdk/$BUILD_ID/slappforge-sdk.zip"
                ]
            }
        ]
    }
})
    .catch(e => {
        throw Error("Failed to start build: " + e);
    })

Basically we retrieve the source from GitHub, fetch the dependencies via an npm install, bundle the whole thing using a zip command container (took me a while to figure that out, which is why I'm posting this!) and upload the resulting zip to a GCS bucket.

We can tidy this up a bit (and perhaps make the template reusable for subsequent builds) by extracting the parameters into a substitutions section:

const repoUrl = "https://github.com/slappforge/slappforge-sdk";
const projectName = "slappforge-sdk";
const bucket = "sdk-archives";

builds.create({
    projectId: projectId,
    resource: {
        steps: [
            {
                name: "gcr.io/cloud-builders/git",
                args: ["clone", "$_REPO_URL", "."]
            },
            {
                name: "gcr.io/cloud-builders/npm",
                args: ["install"]
            },
            {
                name: "kramos/alpine-zip",
                args: [
                    "-q",
                    "-x", "package.json", ".git/", ".git/**", "README.md",
                    "-r",
                    "$_PROJECT_NAME.zip",
                    "."
                ]
            },
            {
                name: "gcr.io/cloud-builders/gsutil",
                args: [
                    "cp",
                    "$_PROJECT_NAME.zip",
                    "gs://$_BUCKET_NAME/$_PROJECT_NAME/$BUILD_ID/$_PROJECT_NAME.zip"
                ]
            }
        ],
        substitutions: {
            _REPO_URL: repoUrl,
            _PROJECT_NAME: projectName,
            _BUCKET_NAME: bucket
        }
    }
})
    .catch(e => {
        throw Error("Failed to start build: " + e);
    })

Once the build is started, we can monitor it like so (with a few somewhat-neat wrappers to properly manage the timer logic):

    .then(response => {
        let timer = {
            handle: null
        };

        startTimer(() => {
            return builds.get({
                projectId: projectId,
                id: response.data.metadata.build.id
            })
                .catch(e => {
                    throw e;
                })
                .then(response => {
                    const COMPLETE_STATES = ["SUCCESS", "DONE", "FAILURE", "CANCELLED"];
                    if (COMPLETE_STATES.includes(response.data.status)) {
                        return false;
                    }
                    return true;
                })
        }, timer, 5000);
    });

// small utility to run a timer task without multiple concurrent requests

const startTimer = (func, timer, period) => {
    let caller = () => {
        func().then(repeat => {
            if (repeat) {
                timer.handle = setTimeout(caller, period);
            }
        });
    };
    timer.handle = setTimeout(caller, period);
};

Once the build reaches a completed state, you are done!

If you want fine-grained details, just dig into response.data within the timer callback blocks.
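
For example - just a sketch; field names like status, steps and logUrl are as per the v1 Build resource, so double-check them against the API reference - you could log the overall status and per-step progress from within the polling .then:

    .then(response => {
        const build = response.data;
        // overall build status, plus a link to the full log in the GCP console
        console.log(`Build ${build.id} is ${build.status} (logs: ${build.logUrl})`);
        // per-step progress
        (build.steps || []).forEach(step =>
            console.log(`  ${step.name}: ${step.status}`));
        // ... and then run the completion check as before
    })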

Happy CloudBuilding!

Friday, April 20, 2018

Serverless: a no-brainer!

A few years ago, containers swept through the dev and devops lands like a category-6 hurricane.

Docker. Rkt. Others.

Docker Swarm.

K8s.

OpenShift.

Right now we are literally at the epicenter, but when we glance at the horizon we see another one coming!

Serverless.

The funny thing is, "serverless" itself is a misnomer.

Of course there are servers. There are always servers. How can programs execute themselves in thin air, without the support of the underlying hardware and utility modules? So, there are servers.

Just not where you would expect them to be.

Traversing the timeline of computing, we see the turbulent track record of servers: birth in secret dungeons of vacuum tubes and city-scale power supplies; multi-ton boxes; networks; clusters; cloud datacenters and server farms (agriculture just lost its royalty!); containers.

Over time, we see servers losing their significance. Gradually, but steadily.

And now, suddenly, puff! They are gone.

Invisible, to be precise.

With serverless, you no longer care about the server. It may be a physical machine, a cloud VM, a K8s pod, an ECS container... heck, even an IoT rig.

Nobody cares, as long as the job gets done.

In this sense, we realize that serverless is nothing new; the concept, and even some practical implementations, have been around since as far back as 2006. You yourself may have benefited from serverless (or conceptually serverless) architectures; while one may argue that they are PaaSes, Google App Engine and (especially) Google Apps Script are good examples from my Google-ridden "fungramming" history.

Just like with touchscreens, resemblances of serverless had always been around, but never has the marketing hype been this intense - and it is obviously growing, so we'll surely see more of it as time flies by.

AWS had an early entry to the arena and currently owns a huge market share, bigger than all the others combined; Azure is behind, but catching up fast; and Google still seems to be more focused on Kubernetes and related containerization stuff, although they too are on track with Cloud Functions and Firebase.

Streaming and event-driven architectures are playing their part in bringing value to serverless. We should also not forget the cloud hype that made people go every-friggin-thing-as-a-service and later left them wondering how they could pay only for what they really use, only while they use it.

All ramblings aside, serverless is growing in popularity. Platforms are evolving to support more event sources, better integration with other services, and richer monitoring and statistics. Frameworks like Serverless are striving to provide a unified, generified serverless development experience, while IDEs like Sigma are doing their part in helping newbies (and sometimes even professionals) get going with serverless with minimum hassle and maximum speed.

Being new and shiny does not necessarily mean that serverless is the silver bullet for all your dev issues; in fact, right now it fits only a few enterprise use cases (primarily due to the lack of strong guarantees, which are quite commonplace in the bureaucratic enterprise atmosphere). Nevertheless, providers are already working on this, and we can expect some disruptive - if not revolutionary - changes in the not-too-distant future. However, it is always best to re-evaluate your requirements before officially stepping into the serverless world, because serverless demands quite a shift in your application architecture, your devops, and the very core of your developer mindset.

And, of course, the best way to pick the cake is to taste it yourself.

Sigma the Serverless IDE: resources, triggers, and heck, operations

With serverless, you stopped caring about the server.

With Sigma, you stopped (or will stop, if not already) caring about the platform.

Now all you care about is your code - the bliss of every programmer.

Or is it?

I hold her (the code) in my arms.

If you have done time with serverless frameworks, you would already know how they take away your platform-phobia, abstracting out the platform-specific bits of your serverless app.

And if you have already tried out Sigma, you would have noticed how it takes things further, relieving you of the burden of the configuration and deployment aspects as well.

Sigma for a healthier dev life!

Leaving behind just the code.

Just the beautiful, raw code.

So, what's the catch?

Okay. Now for the untold, unspoken, not-so-popular part.

You see, my friend, every good thing comes at a price.

Lucky for you, with Sigma, the price is very affordable.

Just a matter of sticking to a few ground rules while you develop your app. That's all.

Resources, resources.

All of Sigma's voodoo depends on one key thing: resources.

Resources, resources!

The concept is quite simple: every piece of your serverless app - be it a DynamoDB table, an S3 bucket or an SNS topic - is a resource from Sigma's point of view.

If you remember the Sigma UI, the Resources pane on the left contains different resource types that you can have in your serverless app. (True, it's pretty short; but we're working on it :))

Resources pane in Sigma UI

Behind the scenes

When you drag a resource from this pane into your code, Sigma secretly creates a resource (which it would later deploy into your serverless provider) to track the configuration of the actual service entity (say, the S3 bucket that should exist in AWS by the time your function is running) and all its usages within your code. The tracking is fully automated; frankly, you don't even need to know about it.

Sigma tracks resources in your app, and deploys them into the underlying platform!

"New" or "existing"?

On almost all of Sigma's resource configuration pop-ups, you may have noticed two options: "new" vs "existing". "New" resources are the ones that would be (or have already been) created as a result of your project, whereas "existing" ones are those which have been created outside of your project.

Now that's a tad bit strange because we would usually use "existing" to denote things that "exist", regardless of their origin - even if they came from Mars.

Better brace yourself, because this also gives rise to a weirder notion: once you have deployed your project, the created resources (which now "exist" in your account) are still treated by Sigma as "new" resources!

And, as if that wasn't enough, this makes the resource lists in Sigma behave in totally unexpected ways: after you define a "new" resource, whenever you want to reuse it somewhere else, you have to look for it under the "existing" tab of the resource pop-up - yet it will be marked with a "(new)" prefix because, although it is already defined, it remains "new" from Sigma's point of view.

Now, how sick is that?!

Bang head here.

Perhaps we should have called them "Sigma" resources; or perhaps even better, "project" resources; while we scratch our heads, feel free to chip in and help us with a better name!

Rule o' thumb

Until this awkwardness is settled, the easiest way to get through this mess is to stick to this rule of thumb:


If you added a resource to your current Sigma project, Sigma would treat it as a "new" resource till the end of eternity.


Bottom line: no worries!

Being able to use existing resources is sometimes cool, but it means that your project would be much less portable: Sigma will always assume that the resources referenced by your project already exist, regardless of which AWS account you attempt to deploy it into. At least until (if) we (ever) come up with a different resource management mechanism.


If you want portability, always stick to new resources. That way, even if a complete stranger gets hold of your project and deploys it in his own, alien, unheard-of AWS account, the project would still work.

If you are integrating with an already existing set of resources (e.g. the S3 buckets in your already-running dev/test/prod environment), using existing resources is the obvious (and most convenient) choice.


Anyways, back to our discussion:

Where were we?

Ah, yes. Resources.

The secret life of resources

In a serverless app, you basically use resources for two things:

  • for triggering the app (as an event source, a.k.a. trigger)
  • for performing work inside the app, such as invoking external services

triggers and operations

Resources? Triggers?? Operations???

Sigma also associates its resources with your serverless app in a similar fashion:

In Sigma, a function can have several triggers (as long as the application logic itself is written to handle the different trigger event types!), and can contain several operations (obviously).

Yet, they're different.

It is noteworthy that a resource itself is not a trigger or an operation; triggers and operations are associated with resources (they kind of "bridge" functions and resources) but a resource has its own independent life. As a result, a resource can power many triggers (to be precise, zero or more) and get involved in many operations, across many (again, zero or more) functions.

A good example is S3. If you want to write an image resizer function that picks up and processes images dropped into an S3 bucket, you would configure an S3 trigger to invoke the function upon the file drop, and an S3 GetObject operation to retrieve and process the file; however, both will point to the same S3 resource, namely the bucket where images are dropped into and fetched from.
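
For the curious, here's a minimal sketch of what such a function could look like in NodeJS (the actual resizing logic is omitted; the bucket and key come straight from the S3 trigger event):

const AWS = require("aws-sdk");
const s3 = new AWS.S3();

exports.handler = async (event) => {
    for (const record of event.Records) {
        // the same bucket acts as the resource behind both the trigger and the operation
        const bucket = record.s3.bucket.name;
        const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));
        const obj = await s3.getObject({Bucket: bucket, Key: key}).promise();
        // ... resize obj.Body here, and write the result wherever you see fit
    }
};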

Launch time!

At deployment, Sigma will take care of putting the pieces together - trigger configs, runtime permissions and whatnot - based on which function is associated with which resources, and in which ways (trigger-mode vs operation-mode). You can simply drag, drop and configure your buckets, queues and stuff, write your code, and totally forget about the rest!

That's the beauty of Sigma.

When a resource is "abandoned" (meaning that it is not used in any trigger or operation), it shows up in the "unused resources" list (remember the dustbin button on the toolbar?) and can be removed from the project. Remember that if you do this, and the resource is a "new" one (rule of thumb: one created in Sigma), it will be automatically removed from your serverless provider account (for example, AWS) during your next deployment!

So there!

If Sigma's resource model (the whole purpose of this article) looks like a total mess-up to you, feel free to raise your voice on StackOverflow - or better still, on our GitHub space, FB page or Twitter feed; we would appreciate it very much!

Of course, Sigma has nothing to hide; if you check your AWS account after a few Sigma deployments, you would realize the things we have been doing under the hood.

All of it, to make your serverless journey as smooth as possible.

And easy.

And fun. :)

Welcome to the world of Sigma!

Thursday, April 19, 2018

Sigma QuickBuild: Towards a Faster Serverless IDE

TL;DR

The QuickBuild/QuickDeploy feature described here is pretty much obsoleted by the test framework (ingeniously hacked together by @CWidanage), that gives you a much more streamlined dev-test experience with much better response time!


In case you hadn't noticed, we have recently been chanting about a new Serverless IDE, the mighty SLAppForge Sigma.

With Sigma, developing a serverless app becomes as easy as drag-drop, code, and one-click-Deploy; no getting lost among overcomplicated dashboards, no eternal struggles with service entities and their permissions, no sailing through oceans of docs and tutorials - above all that, nothing to install (just a web browser - which you already have!).

So, how does Sigma do it all?

In case you already tried Sigma and dug a bit deeper than just deploying an app, you may have noticed that it uses AWS CodeBuild under the hood for the build phase. While CodeBuild gives us a fairly simple and convenient way of configuring and running builds, it has its own set of drawbacks:

  • CodeBuild takes a significant time to complete (sometimes close to a minute). This may not be a problem if you just deploy a few sample apps, but it can severely impair your productivity - especially when you begin developing your own solution, and need to reflect your code updates every time you make a change.
  • The AWS Free Tier only includes 100 minutes of CodeBuild time per month. While this sounds like a generous amount, it can expire much faster than you think - especially when developing your own app, in your usual trial-and-error cycles ;) True, CodeBuild doesn't cost much either ($0.005 per minute of build.general1.small), but why not go free while you can? :)

Options, people?

Lambda, on the other hand, has a rather impressive free quota of 1 million executions and 3.2 million seconds of execution time (that's 400,000 GB-seconds, at the minimum 128 MB memory setting) per month. Moreover, traffic between S3 and Lambda is free as far as we are concerned!

Oh, and S3 has a free quota of 20,000 reads and 2,000 writes per month - which, with some optimizations on the reads, is quite sufficient for what we are about to do.

2 + 2 = ...

So, guess what we are about to do?

Yup, we're going to update our Lambda source artifacts in S3, via Lambda itself, instead of CodeBuild!

Of course, replicating the full CodeBuild functionality via a lambda would need a fair deal of effort, but we can get away with a much simpler subset; read on!

The Big Picture

First, let's see what Sigma does when it builds a project:

  • prepare the infra for the build, such as a role and an S3 bucket, skipping any that already exist
  • create a CodeBuild project (or, if one already exists, update it to match the latest Sigma project spec)
  • invoke the project, which will:
    • download the Sigma project source from your GitHub repo,
    • run an npm install to populate its dependencies,
    • package everything into a zip file, and
    • upload the zip artifact to the S3 bucket created above
  • monitor the project progress, and retrieve the URL of the uploaded S3 file when done.

And usually every build has to be followed by a deployment, to update the lambdas of the project to point to the newly generated source archive; and that means a whole load of additional steps!

  • create a CloudFormation stack (if one does not exist)
  • create a changeset that contains the latest updates to be published
  • execute the changeset, which will, at the least, have to:
    • update each of the lambdas in the project to point to the new source zip file generated by the build, and
    • in some cases, update the triggers associated with the modified lambdas as well
  • monitor the stack progress until it gets through with the update.

All in all, a good 60-90 seconds of your precious time - all to accommodate perhaps just one line (or one word, or even one letter!) of change!

Can we do better?

At first glance, we see quite a few redundancies and possible improvements:

  • Cloning the whole project source from scratch is overkill, especially when only a few lines/files have changed.
  • Every build will download and populate the NPM dependencies from scratch, consuming bandwidth, CPU cycles and build time.
  • The whole zip file is now being prepared from scratch after each build.
  • Since we're still in dev, running a costly CF update for every single code change doesn't make much sense.

But since CodeBuild invocations are stateless and CloudFormation's resource update logic is mostly out of our hands, we don't have the freedom to meddle with much of the above, other than simple improvements like enabling dependency caching.

Trimming down the fat

However, if we have a lambda, we have full control over how we can simplify the build!

If we think about 80% - or maybe even 90% - of the cases for running a build, we see that they merely involve changes to application logic (code); you don't add new dependencies, move your files around or change your repo URL all the time, but you sure as heck would go through an awful lot of code edits until your code starts behaving as you expect it to!

And what does this mean for our build?

80% - or even 90% - of the time, we can get away with updating just the modified files in the lambda source zip, and updating the lambda functions themselves to point to the updated file!

Behold, here comes QuickDeploy!

And that's exactly what we do, with the QuickBuild/QuickDeploy feature!

Lambda to the rescue!

QuickBuild uses a lambda (deployed in your own account, to eliminate the need for cross-account resource access) to:

  • fetch the latest CodeBuild zip artifact from S3,
  • patch the zip file to accommodate the latest code-level changes, and
  • upload the updated file back to S3, overwriting the original zip artifact

Once this is done, we can run a QuickDeploy which simply sends an UpdateFunctionCode Lambda API call to each of the affected lambda functions in your project, so that they can scoop up the latest and greatest of your serverless code!

And the whole thing does not take more than 15 seconds (give or take the network delays): a raw 4x improvement in your serverless dev workflow!

A sneak peek

First of all, we need a lambda that can modify an S3-hosted zip file based on a given set of input files. While it's easy to make with NodeJS, it's even easier with Python, and requires zero external dependencies as well:

Here we go... Pythonic!

import boto3

from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED

s3_client = boto3.client('s3')

def handler(event, context):
  src = event["src"]
  if src.find("s3://") > -1:
    src = src[5:]
  
  bucket, key = src.split("/", 1)
  src_name = "/tmp/" + key[(key.rfind("/") + 1):]
  dst_name = src_name + "_modified"
  
  s3_client.download_file(bucket, key, src_name)
  zin = ZipFile(src_name, 'r')
  
  diff = event["changes"]
  zout = ZipFile(dst_name, 'w', ZIP_DEFLATED)
  
  added = 0
  modified = 0
  
  # files that already exist in the archive
  for info in zin.infolist():
    name = info.filename
    if (name in diff):
      modified += 1
      zout.writestr(info, diff.pop(name))
    else:
      zout.writestr(info, zin.read(info))
  
  # files in the diff, that are not on the archive
  # (i.e. newly added files)
  for name in diff:
    info = ZipInfo(name)
    info.external_attr = 0755 << 16L
    added += 1
    zout.writestr(info, diff[name])
  
  zout.close()
  zin.close()
  
  s3_client.upload_file(dst_name, bucket, key)
  return {
    'added': added,
    'modified': modified
  }

We can directly invoke the lambda using the Invoke API, hence we don't need to define a trigger for the function; just a role with S3 full access permissions would do. (We use full access here because we would be reading from/writing to different buckets at different times.)

CloudFormation, you beauty.

From what I see, the coolest thing about this contraption is that you can stuff it all into a single CloudFormation template (remember the lambda command shell?) that can be deployed (and undeployed) in one go:

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  zipedit:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: zipedit
      Handler: index.handler
      Runtime: python2.7
      Code:
        ZipFile: >
          import boto3
          
          from zipfile import ZipFile, ZipInfo, ZIP_DEFLATED
          
          s3_client = boto3.client('s3')
          
          def handler(event, context):
            src = event["src"]
            if src.find("s3://") > -1:
              src = src[5:]
            
            bucket, key = src.split("/", 1)
            src_name = "/tmp/" + key[(key.rfind("/") + 1):]
            dst_name = src_name + "_modified"
            
            s3_client.download_file(bucket, key, src_name)
            zin = ZipFile(src_name, 'r')
            
            diff = event["changes"]
            zout = ZipFile(dst_name, 'w', ZIP_DEFLATED)
            
            added = 0
            modified = 0
            
            # files that already exist in the archive
            for info in zin.infolist():
              name = info.filename
              if (name in diff):
                modified += 1
                zout.writestr(info, diff.pop(name))
              else:
                zout.writestr(info, zin.read(info))
            
            # files in the diff, that are not on the archive
            # (i.e. newly added files)
            for name in diff:
              info = ZipInfo(name)
              info.external_attr = 0755 << 16L
              added += 1
              zout.writestr(info, diff[name])
            
            zout.close()
            zin.close()
            
            s3_client.upload_file(dst_name, bucket, key)
            return {
                'added': added,
                'modified': modified
            }
      Timeout: 60
      MemorySize: 256
      Role:
        Fn::GetAtt:
        - role
        - Arn
  role:
    Type: AWS::IAM::Role
    Properties:
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      - arn:aws:iam::aws:policy/AmazonS3FullAccess
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
        - Action: sts:AssumeRole
          Effect: Allow
          Principal:
            Service: lambda.amazonaws.com
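
For instance - just a sketch, assuming the template above is saved as template.yaml, with "zipedit-quickbuild" being an arbitrary stack name - the stack can be spun up via the AWS SDK like so:

const AWS = require("aws-sdk");
const fs = require("fs");

const cf = new AWS.CloudFormation({region: "us-east-1"});

cf.createStack({
    StackName: "zipedit-quickbuild",
    TemplateBody: fs.readFileSync("template.yaml", "utf8"),
    Capabilities: ["CAPABILITY_IAM"]    // required, since the template creates an IAM role
})
    .promise()
    .then(() => cf.waitFor("stackCreateComplete", {StackName: "zipedit-quickbuild"}).promise())
    .then(() => console.log("zipedit stack is ready!"))
    .catch(err => console.error("Stack creation failed:", err));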

Moment of truth

Once the stack is ready, we can start submitting our QuickBuild requests to the lambda!

// assuming auth stuff is already done
let lambda = new AWS.Lambda({region: "us-east-1"});

// ...

lambda.invoke({
  FunctionName: "zipedit",
  Payload: JSON.stringify({
    src: "s3://bucket/path/to/archive.zip",
    changes: {
      "path/to/file1/inside/archive": "new content of file1",
      "path/to/file2/inside/archive": "new content of file2",
      // ...
    }
  })
}, (err, data) => {
  let result = JSON.parse(data.Payload);
  let totalChanges = result.added + result.modified;
  if (totalChanges === expected_no_of_files_from_changes_list) {
    // all izz well!
  } else {
    // too bad, we missed a spot :(
  }
});

Once QuickBuild has completed updating the artifact, it's simply a matter of calling UpdateFunctionCode on the affected lambdas, with the S3 URL of the artifact:

lambda.updateFunctionCode({
  FunctionName: "original_function_name",
  S3Bucket: "bucket",
  S3Key: "path/to/archive.zip"
})
.promise()
.then(() => { /* done! */ })
.catch(err => { /* something went wrong :( */ });

(In our case the S3 URL remains unchanged (because our lambda simply overwrites the original file), but it still works because the Lambda service makes a copy of the code artifact when updating the target lambda.)

To speed up the QuickDeploy for multiple lambdas, we can even parallelize the UpdateFunctionCode calls:

Promise.all(
  lambdaNames.map(name =>
    lambda.updateFunctionCode({ /* params */ })
      .promise()
      .then(() => { /* done! */ })
  )
)
  .then(() => { /* all good! */ })
  .catch(err => { /* failures; handle them! */ });

And that's how we gained an initial 4x improvement in our lambda deployment cycle, sometimes even faster than the native AWS Lambda console!