Go closures

Say we need a map to store various versions of a configuration in Go. Here is a simple example of the structure:

    envs := map[string]string{
        "dev":  "1",
        "prod": "2",
    }

Given this config map, we need to create an additional map that uses the same strings as the keys, but has functions for values. The catch is that the body of each function needs to make use of the value from its corresponding key.
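One way to build such a map is with closures. The following is a minimal sketch, not necessarily the approach the full post takes: the name buildHandlers and the message each function returns are hypothetical, chosen only to show every function capturing the value for its own key.

```go
package main

import (
	"fmt"
	"sort"
)

// buildHandlers returns a map with the same keys as envs. Each value is a
// function whose body uses the config value stored under that key.
// The name and the returned message format are illustrative assumptions.
func buildHandlers(envs map[string]string) map[string]func() string {
	handlers := make(map[string]func() string, len(envs))
	for name, version := range envs {
		// Copy the loop variables so each closure captures its own pair.
		// This is required before Go 1.22 and harmless from 1.22 onward.
		name, version := name, version
		handlers[name] = func() string {
			return fmt.Sprintf("%s is running config version %s", name, version)
		}
	}
	return handlers
}

func main() {
	envs := map[string]string{
		"dev":  "1",
		"prod": "2",
	}
	handlers := buildHandlers(envs)

	// Map iteration order is random, so sort the keys for stable output.
	keys := make([]string, 0, len(handlers))
	for k := range handlers {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		fmt.Println(handlers[k]())
	}
}
```

Without the per-iteration copy, Go versions before 1.22 share one loop variable across iterations, so every function would report whichever key and value the loop visited last.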
Markdown is a useful tool – these blog posts are written in it. I like Markdown because once you learn it, it feels invisible. It is minimal and intuitive. However, sometimes you need it to do things a little differently. I ran into an issue where I had content that had to be written in Markdown only (no HTML) and later rendered as HTML and inserted into a webpage, but I needed to add attributes to the generated HTML tags.
Creating a Go module

We’re going to create a CLI tool for sending a message to a Slack channel. This post is similar to my earlier post: Creating an Elixir Module. We’ll be using the chat.postMessage Slack API endpoint. Also, make sure you have a Slack API token. Our CLI syntax will be:

    $ ./slack -message 'hello world!' -channel @slackbot

First, make sure you have your $GOPATH set properly.
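As a rough sketch of where this is heading, the program below parses the two flags from the CLI syntax above and posts to chat.postMessage. It is not the code from the full post: the SLACK_TOKEN environment variable and the helper name newPostMessageRequest are assumptions made for illustration, and the token is sent as a bearer token in the Authorization header.

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"net/url"
	"os"
	"strings"
)

const postMessageURL = "https://slack.com/api/chat.postMessage"

// newPostMessageRequest builds, but does not send, a form-encoded POST
// request for chat.postMessage. The helper name is hypothetical.
func newPostMessageRequest(token, channel, message string) (*http.Request, error) {
	form := url.Values{}
	form.Set("channel", channel)
	form.Set("text", message)

	req, err := http.NewRequest(http.MethodPost, postMessageURL, strings.NewReader(form.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	message := flag.String("message", "", "text of the message to send")
	channel := flag.String("channel", "", "channel or user to send to, e.g. @slackbot")
	flag.Parse()

	// Assumption: the API token is read from an environment variable.
	token := os.Getenv("SLACK_TOKEN")
	if *message == "" || *channel == "" || token == "" {
		fmt.Fprintln(os.Stderr, "set SLACK_TOKEN and pass both -message and -channel")
		flag.Usage()
		return
	}

	req, err := newPostMessageRequest(token, *channel, *message)
	if err != nil {
		fmt.Fprintln(os.Stderr, "building request:", err)
		return
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, "sending request:", err)
		return
	}
	defer resp.Body.Close()

	fmt.Println("Slack responded with HTTP status", resp.Status)
}
```

Keeping request construction in its own function makes it possible to check the URL, headers and form fields without calling Slack. A fuller version would also decode the JSON response body, since Slack reports API-level errors there rather than through the HTTP status alone.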
supervisor is a UNIX utility for managing and respawning long-running Python processes to ensure they are always running. Or, according to its website: Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.

Installation

supervisor can be installed with pip:

    $ pip install supervisor

Given a script test_proc.py, start the process under supervisor with

    $ sudo supervisorctl start test_proc

Now it will run forever and you can see the process running with
Querying S3 with Presto

This post assumes you have an AWS account and a Presto instance (standalone or cluster) running. We’ll use the Presto CLI to run the queries against the Yelp dataset. The dataset is a JSON dump of a subset of Yelp’s data for businesses, reviews, checkins, users and tips.

Configure Hive metastore

Configure the Hive metastore to point at our data in S3. We are using the docker container inmobi/docker-hive
Creating a Presto Cluster

I first came across Presto when researching data virtualization - the idea that all of your data can be integrated regardless of its format or storage location. One can use scripts or periodic jobs to mash up data or create regular reports from several independent sources. However, these methods don’t scale well, especially when the queries change frequently or the data is ingested in real time. Presto allows one to query a variety of data sources using SQL and presents the data in a standard table format, where it can be manipulated and JOINed like traditional relational data.
Creating an Elixir module

To get a better handle on Elixir, I developed a simple CLI tool for sending files in Slack. To create a new project, run

    $ mix new slack_bot

This creates a new Elixir project which looks like this:

    ├── README.md
    ├── config
    │   └── config.exs
    ├── lib
    │   └── slack_bot.ex
    ├── mix.exs
    └── slack_bot
        ├── slack_bot_helper.exs
        └── slack_bot_test.exs

Navigate to the lib folder and create a folder inside it called slack_bot.

Git aliases

Here’s a quick post for managing your git shortcuts. If you use git regularly, you should have a .gitconfig file in your home directory that looks something like this:

    [user]
        email = [email protected]
        name = Your name

You can add an alias section like so:

    [user]
        email = [email protected]
        name = Your name
    [alias]
        ls = log --oneline
        uom = push -u origin master

These aliases can be used like so:
Recently, I have been working with the Python API for Spark to use distributed computing techniques to perform analytics at scale. When you write Spark code in Scala or Java, you can bundle your dependencies in the jar file that you submit to Spark. However, when writing Spark code in Python, dependency management becomes more difficult because each of the Spark executor nodes performing computations needs to have all of the Python dependencies installed locally.
To streamline my blogging workflow, I wanted to go from written to published post quickly. My general workflow for writing a post for this blog looks like this: create a post in _posts, write the post, then run fab sync. Here is the repo. fab sync is a custom command that uses the magic of Fabric to stage, commit and push changes in my blog repo to GitHub. Next, Fabric uses an ssh session in the Python process to connect to the server on which my blog is hosted, pull down the newest changes from the blog repo and, finally, build the Jekyll blog so that the changes are immediately reflected on this site.