Below you will find pages that utilize the taxonomy term “code”
Posts
Migrating to Hugo
This is now my third attempt at migrating from Jekyll to Hugo. I’m writing this post before I finish the migration. Luckily, because this is my third attempt, I’m confident I will see it through to the end.
After following the Hugo quickstart guide, I imported my Jekyll posts
hugo import jekyll ../../my-blog/ . --force
enabled syntax highlighting
hugo gen chromastyles --style=monokai > syntax.css
and turned on the configurations for highlighting by adding the following to my config.
Posts
Pipelines and your Unix toolbox
Unix commands are great for manipulating data and files. They get even better when used in shell pipelines. The following are a few of my go-tos – I’ll list the commands with an example or two. While many of the commands can be used standalone, I’ll provide examples that assume the input is piped in because that’s how you’d use these commands in a pipeline. Lastly, most of these commands are pretty simple and that is by design – the Unix philosophy focuses on simple, modular code, which can be composed to perform more complex operations.
Posts
Go and Unix files
I ran into an odd Unix filename issue while writing Go code the other day.
Here’s a simplified example:
Let’s read a JSON file and unmarshal its contents into a struct in Go. First, let’s set an environment variable with our file name to avoid hardcoded constants in our program.
export MY_FILE="/Users/dancorin/Desktop/test.json "
Now, let’s read the file into our struct:
package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"os"
)

// Stuff struct holds the json contents
type Stuff struct {
	Test string `json:"test"`
}

func main() {
	stuff := Stuff{}
	place := os.
Posts
Debugging go code with delve
Delve is a debugger for the Go programming language. The goal of the project is to provide a simple, full featured debugging tool for Go.
If we run our Go service using a Makefile, with a command like make run, it can be hard to find where to hook in and call dlv debug. We can get around this issue by attaching the delve debugger to our running service instead.
Posts
Go scope
Scoping in Go is built around the notion of code blocks. You can find several good explanations of how variable scoping works in Go on Google. I’d like to highlight one consequence of Go’s block scoping that is slightly unintuitive if you’re used to a language like Python. Keep in mind that this example does not break with Go’s notion of block scoping:
Let’s start with a common pattern in Python:
class Data(object):
    def __init__(self, val):
        self.
Posts
Tracking a call stack in Go with context
The use of context in Go can help you pass metadata through your program with helpful, related information about a call. Let’s build an example where we set a context key called "stack" which keeps a history of the function names called over the lifetime of the context. As we pass the context object through a few layers of functions, we’ll append the name of the function to the value of the context key "stack".
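As a minimal sketch of that idea (not the post's own code), the following passes a context through three layers and appends each function name to the stack. The names pushStack, stackOf, handler, service, store and run are hypothetical, and the sketch uses an unexported key type instead of the string "stack", which is an assumption made to avoid key collisions between packages.

```go
package main

import (
	"context"
	"fmt"
)

// stackKey is a hypothetical unexported key type; it plays the role of
// the "stack" key without risking collisions with other packages.
type stackKey struct{}

// pushStack returns a child context whose stack has name appended.
func pushStack(ctx context.Context, name string) context.Context {
	prev, _ := ctx.Value(stackKey{}).([]string)
	// Copy the slice so sibling contexts never share a backing array.
	next := make([]string, len(prev), len(prev)+1)
	copy(next, prev)
	next = append(next, name)
	return context.WithValue(ctx, stackKey{}, next)
}

// stackOf returns the recorded function names, oldest first.
func stackOf(ctx context.Context) []string {
	s, _ := ctx.Value(stackKey{}).([]string)
	return s
}

func handler(ctx context.Context) []string {
	ctx = pushStack(ctx, "handler")
	return service(ctx)
}

func service(ctx context.Context) []string {
	ctx = pushStack(ctx, "service")
	return store(ctx)
}

func store(ctx context.Context) []string {
	ctx = pushStack(ctx, "store")
	return stackOf(ctx)
}

// run starts the call chain from an empty background context.
func run() []string {
	return handler(context.Background())
}

func main() {
	fmt.Println(run()) // [handler service store]
}
```

Because context values are immutable, each layer derives a new context rather than mutating the parent, so a caller's stack is never changed by the functions it calls.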
Posts
Go channels
Go uses goroutines to execute multiple bits of code at the same time. Channels allow for the aggregation of the results of these concurrent calls after they have finished.
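A rough sketch of that pattern, under stated assumptions: fetch is a hypothetical stand-in for a slow request (it sleeps instead of touching the network), and fetchAll is an illustrative helper that starts one goroutine per request and gathers the results from a channel once they have all finished.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetch is a hypothetical stand-in for a slow request; no network is used.
func fetch(id int) string {
	time.Sleep(10 * time.Millisecond)
	return fmt.Sprintf("response %d", id)
}

// fetchAll runs one goroutine per id and collects the results over a channel.
func fetchAll(ids []int) []string {
	// Buffered so every goroutine can send without waiting for a receiver.
	results := make(chan string, len(ids))
	var wg sync.WaitGroup
	for _, id := range ids {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			results <- fetch(id)
		}(id)
	}
	wg.Wait()
	close(results)

	out := make([]string, 0, len(ids))
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	// Results arrive in completion order, which is not guaranteed.
	for _, r := range fetchAll([]int{1, 2, 3}) {
		fmt.Println(r)
	}
}
```

The three calls overlap, so the total wait is close to one request's latency rather than the sum of all three.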
Consider a case where we want to make several GET requests to a server. The server takes some time to process each request, but in many cases it can handle many simultaneous connections. In a language like Python, we might do the following to make several requests:
Posts
Go closures
Say we need a map to store various versions of a configuration in Go. Here is a simple example of the structure:
envs := map[string]string{
	"dev":  "1",
	"prod": "2",
}
Given this config map, we need to create an additional map that uses the same strings as the keys, but has functions for values. The catch is that the body of each function needs to make use of the value from its corresponding key.
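One way this could look, as a sketch rather than the post's solution: buildFuncs is a hypothetical helper, and the returned string format is made up for illustration. The per-iteration copy of the loop variables matters on Go versions before 1.22, where every closure would otherwise capture the same shared variables.

```go
package main

import "fmt"

// buildFuncs (hypothetical name) returns a map with the same keys as envs,
// where each value is a function that uses that key's config value.
func buildFuncs(envs map[string]string) map[string]func() string {
	funcs := make(map[string]func() string, len(envs))
	for name, version := range envs {
		// Copy the loop variables so each closure captures its own pair.
		// Required before Go 1.22; harmless afterwards.
		name, version := name, version
		funcs[name] = func() string {
			return fmt.Sprintf("%s uses version %s", name, version)
		}
	}
	return funcs
}

func main() {
	envs := map[string]string{
		"dev":  "1",
		"prod": "2",
	}
	funcs := buildFuncs(envs)
	fmt.Println(funcs["dev"]())  // dev uses version 1
	fmt.Println(funcs["prod"]()) // prod uses version 2
}
```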
Posts
Custom Markdown rendering
Markdown is a useful tool – these blog posts are written in it. I like Markdown because once you learn it, it feels invisible. It is minimal and intuitive. However, sometimes you need it to do things a little differently.
I ran into an issue where I had content which had to be written in only Markdown (no HTML) and later needed to be rendered as HTML and inserted onto a webpage, but I needed to add attributes to the HTML tags that were generated.
Posts
Creating a Go module
We’re going to create a CLI tool for sending a message to a channel in Slack using the command line. This post is similar to my earlier post: Creating an Elixir Module. We’ll be using the chat.postMessage Slack API endpoint. Also, make sure you have a Slack API token.
Our CLI syntax will be:
$ ./slack -message 'hello world!' -channel @slackbot
First, make sure you have your $GOPATH set properly.
Posts
Quickstart `supervisor` guide
supervisor is a UNIX utility for managing and respawning long-running Python processes to ensure they are always running. Or according to its website:
Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.
Installation
supervisor can be installed with pip
$ pip install supervisor
Given a script test_proc.py, start the process under supervisor as
Posts
Querying S3 with Presto
This post assumes you have an AWS account and a Presto instance (standalone or cluster) running. We’ll use the Presto CLI to run the queries against the Yelp dataset. The dataset is a JSON dump of a subset of Yelp’s data for businesses, reviews, checkins, users and tips.
Configure Hive metastore
Configure the Hive metastore to point at our data in S3. We are using the Docker container inmobi/docker-hive
Posts
Creating a Presto Cluster
I first came across Presto when researching data virtualization - the idea that all of your data can be integrated regardless of its format or storage location. One can use scripts or periodic jobs to mash up data or create regular reports from several independent sources. However, these methods don’t scale well, especially when the queries change frequently or the data is ingested in real time. Presto allows one to query a variety of data sources using SQL and presents the data in a standard table format, where it can be manipulated and JOINed like traditional relational data.
Posts
Automatic Python virtual environments
Python virtual environments are great for separating your development environments for each project. You can start with a fresh install and dependencies for each project, which helps keep your project dependency list short and your Python path clean. I use virtualenvwrapper and this setup to make new environment creation easy, but I find myself constantly running a project only to realize that I haven’t activated the proper environment.
Posts
Creating an Elixir module
To get a better handle on Elixir, I developed a simple CLI tool for sending files in Slack.
To create a new project, run
$ mix new slack_bot
This creates a new Elixir project which looks like this
├── README.md
├── config
│   └── config.exs
├── lib
│   └── slack_bot.ex
├── mix.exs
└── slack_bot
    ├── slack_bot_helper.exs
    └── slack_bot_test.exs
Navigate to the lib folder and create a folder inside it called slack_bot.
Posts
Git aliases
Here’s a quick post for managing your git shortcuts. If you use git regularly, you should have a .gitconfig file in your home directory that looks something like this:
[user]
	email = [email protected]
	name = Your name
You can add an alias section like so:
[user]
	email = [email protected]
	name = Your name
[alias]
	ls = log --oneline
	uom = push -u origin master
These aliases can be used like so:
Posts
PySpark dependencies
Recently, I have been working with the Python API for Spark to use distributed computing techniques to perform analytics at scale. When you write Spark code in Scala or Java, you can bundle your dependencies in the jar file that you submit to Spark. However, when writing Spark code in Python, dependency management becomes more difficult because each of the Spark executor nodes performing computations needs to have all of the Python dependencies installed locally.
Posts
Python Fabric
To help facilitate my blogging workflow, I wanted to go from written to published post quickly. My general workflow for writing a post for this blog looks like this:
1. Create a post in _posts
2. Write the post
3. Run fab sync
Here is the repo
fab sync is a custom command that uses Fabric to stage, commit and push changes in my blog repo to GitHub. Next, Fabric uses an ssh session in the Python process to connect to the server on which my blog is hosted, pull down the newest changes from the blog repo and finally, build the Jekyll blog so that the changes are immediately reflected on this site.
Posts
Bash SSH host management
If you have a lot of servers to which you frequently connect, keeping track of IP addresses, pem files, and credentials can be tedious. SSH config files are great for this problem, but they don’t play well with bash. I wanted to store all of my hosts’ info in a config file but still have access to the HostNames since sometimes I just need the IP address of a server to use elsewhere.
Posts
Managing bash aliases
Bash aliases are great. Whether you use them to quickly connect to servers or just soup up the standard bash commands, they are a useful tool for eliminating repetitive tasks. I’m always adding new ones to optimize my workflow which, of course, led me to create aliases to optimize that workflow. While there are more complete CLI alternatives for alias management like aka, I prefer two simple commands for managing my aliases, which I keep in ~/.
Posts
Elixir binary search
A few days ago, I saw a Guess my word game on the front page of Hacker News. Before spoiling the fun for myself by checking out the comments, I decided to try my hand at writing a solution in Elixir. Afterwards, I generalized the code to choose its own word from the UNIX dictionary and then “guess” it, applying a binary search based on the feedback of whether each guess was alphabetically greater or less than the word itself.
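The guessing loop can be sketched roughly as follows. This is written in Go rather than the Elixir the post uses, and it is not the post's code: guessWord is a hypothetical helper, and the short in-memory word list stands in for the UNIX dictionary.

```go
package main

import (
	"fmt"
	"strings"
)

// guessWord (hypothetical name) binary-searches a sorted word list for
// target, using only "greater or less" feedback on each guess. It returns
// the word found (or "" if absent) and the number of guesses made.
func guessWord(words []string, target string) (string, int) {
	lo, hi := 0, len(words)-1
	guesses := 0
	for lo <= hi {
		mid := lo + (hi-lo)/2
		guesses++
		switch cmp := strings.Compare(words[mid], target); {
		case cmp == 0:
			return words[mid], guesses
		case cmp < 0:
			// The guess sorts before the target: search the upper half.
			lo = mid + 1
		default:
			// The guess sorts after the target: search the lower half.
			hi = mid - 1
		}
	}
	return "", guesses
}

func main() {
	// A small sorted list standing in for the UNIX dictionary.
	words := []string{"apple", "banana", "cherry", "grape", "melon", "peach", "plum"}
	word, n := guessWord(words, "peach")
	fmt.Println(word, n) // peach 2
}
```

Each guess halves the remaining range, so the number of guesses grows with the logarithm of the dictionary size.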
Posts
qc: quick calculator
If you spend most of your time in the command line, you don’t want to leave to do math. Qc is a script that does in-line command line math without forcing you to exit the main bash prompt as you might with a program like bc or a language interpreter.
#!/bin/bash
python -c "print($1)"
Make the script executable with the command:
$ chmod +x qc.sh
Alias it to qc by editing the .