Sunday, December 23, 2012

Node.js video

Watched the video below on Node and hadn't heard the term "green threads" before. Green threads are just another name for user-space threads, which I was familiar with.

He also mentions coroutines.

Node.js Asynchronous i/o under the hood

I had a conversation with a friend at lunch about how node does async i/o under the covers.

We were basically arguing over whether Node is truly single threaded or whether some operations launch a background thread to handle the work. It turns out the answer is yes, in some cases another thread is launched. For instance:

libeio (by Marc Lehmann) is a fantastic idea: it relies on POSIX threads to provide an asynchronous version of the POSIX API: open, close, stat, unlink, fdatasync, mknod, etc. It would be nice if UNIX kernels provided this asynchronism natively, because the overhead of using a thread for a stat() call when the inode data is already cached in memory is significant.
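libeio's trick, wrapping blocking POSIX calls in a worker-thread pool and completing them via callbacks, can be sketched in a few lines. This is an illustrative Python analogue of the idea, not how libeio itself is implemented:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# A tiny analogue of libeio: blocking POSIX calls run on a small
# worker-thread pool, and the caller gets a future plus a callback
# instead of blocking the main (event-loop) thread.
_pool = ThreadPoolExecutor(max_workers=4)

def async_stat(path, callback):
    """Run os.stat(path) on a worker thread; invoke callback(result) when done."""
    future = _pool.submit(os.stat, path)
    future.add_done_callback(lambda f: callback(f.result()))
    return future
```

This is exactly the overhead the quote complains about: even when the inode data is already cached, the call still pays for a thread handoff in each direction.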

ZeroMQ Internal Architecture

Does ZeroMQ simply retry every so many milliseconds to reach the endpoint during disconnected operation?
Answer: Yes, I believe so, with exponential backoff.

Two cool things I got out of this article.

1. ØMQ's concurrency model may be a bit confusing at first. The reason is that we eat our own dogfood and use message passing to achieve concurrency and internal scalability. Thus, even though ØMQ is a multithreaded application you won't find mutexes, condition variables or semaphores meant to orchestrate the parallel processing. Instead, each object will live in its own thread and no other thread will ever touch it (that's why mutexes are not needed). Other threads will communicate with the object by sending it messages (called 'commands' to distinguish them from user-level ØMQ messages). In the same way, the object can speak to other objects — potentially running in different threads — by sending them 'commands'.

2. The requirements for messages are rather complex. The main reason for complexity is that the implementation should be very efficient for both very small and very large messages. The requirements are as follows:
For very small messages it's cheaper to copy the message than to keep the shared data on the heap. These messages thus have no associated buffer and the data are stored directly in the zmq_msg_t structure — presumably on the stack. This has a huge impact on performance as it almost entirely avoids the need to do memory allocations/deallocations.
When using the inproc transport, the message should never be copied. Thus, the buffer sent in one thread should be received in the other thread and get deallocated there.
Messages should support reference counting. Thus, if a message is published to many different TCP connections, all the sending I/O threads access the same buffer rather than copying the buffer for each I/O thread or TCP connection.
The same trick should be accessible to the user, so that the same physical buffer can be sent to multiple ØMQ sockets without copying the content.
The user should be able to send a buffer that was allocated by an application-specific allocation mechanism without copying the data. This is especially important with legacy applications which allocate large amounts of data.

Considerations in Building a Large Infrastructure

Analyze the types of servers you need. For instance, Facebook had five types of servers:
web page servers, database machines, data storage systems, news feed servers, and memcache servers.

Possible infrastructure I could imagine being implemented

DNS: round robin with health checks, resolving to multiple load balancers
Load balancer 1 --> first group of web servers
Load balancer 2 --> second group of web servers
Web tier: shared session storage instead of sticky sessions on a web server?
Data tier: databases, SQL or NoSQL, distributed for failover and also for the size of the data
Finally: batch processing and analytics

Good article on load balancing

DNS roundrobin is excellent to increase capacity, by distributing the load across multiple points (potentially geographically distributed). But it does not provide fail-over. You must first describe what type of failure you are trying to cover. A server failure must be covered locally using a standard IP address takeover mechanism (VRRP, CARP, ...). A switch failure is covered by resilient links on the server to two switches. A WAN link failure can be covered by a multi-link setup between you and your provider, using either a routing protocol or a layer2 solution (eg: multi-link PPP). A site failure should be covered by BGP : your IP addresses are replicated over multiple sites and you announce them to the net only where they are available.
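As a toy illustration of the round-robin point (it adds capacity, not fail-over), here's a sketch of rotating through a backend pool with a naive health-check predicate layered on top. The class name and `health_check` hook are made up for illustration; real fail-over happens at the layers the article describes (VRRP, BGP, etc.):

```python
import itertools

class RoundRobinPool:
    """Rotate through backends, skipping ones a health check marks down.

    Round robin alone only spreads load; it does not detect failures.
    Detecting failures is what the health_check predicate (normally a
    periodic probe) adds on top.
    """

    def __init__(self, backends, health_check=lambda b: True):
        self.backends = list(backends)
        self.health_check = health_check
        self._cycle = itertools.cycle(self.backends)

    def next_backend(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = next(self._cycle)
            if self.health_check(backend):
                return backend
        raise RuntimeError("no healthy backends")
```

For example, `RoundRobinPool(["10.0.0.1", "10.0.0.2"], health_check=lambda b: b != "10.0.0.2")` will only ever hand out the first address.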

Mechanical Turk

Farm out work to humans that cannot easily be done by a computer.

One worry is the quality of the information, but they solve this by duplicating the work across multiple individuals and comparing their outcomes. If the outcomes don't match, you can send the two outcomes to a third worker for rectification.

What is HMAC Authentication and why is it useful?

HMAC - Hash-based message authentication code

HMAC doesn't by itself protect against replay attacks, but it can be extended to include an incrementing number in requests. If a request arrives with a number that has already been seen, the request is rejected.
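A minimal sketch of that extension using Python's hmac module. The message layout and the per-client counter table are illustrative choices, not a standard protocol:

```python
import hashlib
import hmac

SECRET = b"shared-secret"      # known to client and server (illustrative)
_last_counter_seen = {}        # server-side replay state, per client

def sign_request(client_id, counter, body):
    """Client side: the MAC covers the counter so it can't be tampered with."""
    msg = b"%s|%d|%s" % (client_id, counter, body)
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

def verify_request(client_id, counter, body, signature):
    """Server side: require a valid MAC AND a strictly increasing counter."""
    expected = sign_request(client_id, counter, body)
    if not hmac.compare_digest(expected, signature):
        return False
    if counter <= _last_counter_seen.get(client_id, -1):
        return False  # replayed (or out-of-order) request
    _last_counter_seen[client_id] = counter
    return True
```

Resending a captured request fails the counter check even though its signature is still valid, which is the whole point of the extension.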

Python module search path

Recently a coworker came to me asking why, after installing a module via pip, he couldn't import it in Python.

Turns out pip was installing to a directory that wasn't in the module search path, aka sys.path.

You can check your sys.path by saying "import sys" followed by "sys.path"

Make sure the directory pip is installing your module into is in sys.path

You can pass "-v" to pip to find out where it installed the module, like so:

sudo pip install nltk -v
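A quick way to check both sides of this: print sys.path, and after a successful import print the module's __file__ to see which directory it actually came from (os is used here just as a stand-in module):

```python
import os
import sys

# The directories Python will search for modules, in order.
for entry in sys.path:
    print(entry)

# After a successful import, __file__ shows which directory won.
print("os was loaded from:", os.__file__)
```

If pip's install directory isn't in that list, you can add it via the PYTHONPATH environment variable, or make sure pip belongs to the same interpreter you actually run.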

Two-factor authentication For Increased Security

Two-factor authentication (TFA, T-FA or 2FA) is an approach to authentication which requires the presentation of two or more of the three authentication factors: a knowledge factor ("something the user knows"), a possession factor ("something the user has"), and an inherence factor ("something the user is").

Maybe with something like a YubiKey.

Hashing Passwords Security Attacks

"Salt doesn't prevent dictionary attacks, just precalculated dictionary attacks. In particular, it protects against rainbow tables ( and also ensures that cracking one user's password doesn't automatically let you crack any user who shares that password."

Dictionary and Brute Force Attacks
Lookup Tables
Reverse Lookup Tables
Rainbow Tables

Here's how to avoid those problems:
Set-Cookie: userName=Alice; authCode=eeba95a4...
Where: authCode=HMAC(ROWID, userName + ipAddr)
When you receive this cookie, look up the user in the database, recompute/verify the authCode in the cookie, using ROWID and ip address of the request. No need to store cookies in the database.
For extra crypto points, throw a salt parameter into the mix:
Set-Cookie: userName=Alice; salt=59843...; authCode=eeba9...
Where: authCode=HMAC(ROWID, userName + ipAddr + salt)
Salt value is generated randomly for every cookie you produce. There's no need to keep it a secret.
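Sketching the salted variant in Python. Note one deliberate deviation, flagged here: the snippet uses a server-side secret as the HMAC key and folds ROWID into the message, a common variant of the quoted HMAC(ROWID, ...) construction; the cookie field layout mirrors the scheme above:

```python
import hashlib
import hmac
import os

SERVER_KEY = b"server-side secret key"   # never leaves the server

def make_cookie(row_id, user_name, ip_addr):
    """Build the cookie payload: userName, a random salt, and an HMAC
    binding the DB row id, user name, and client IP together."""
    salt = os.urandom(8).hex()
    msg = ("%s%s%s%s" % (row_id, user_name, ip_addr, salt)).encode()
    auth_code = hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()
    return "userName=%s; salt=%s; authCode=%s" % (user_name, salt, auth_code)

def verify_cookie(cookie, row_id, ip_addr):
    """Recompute authCode from the request's IP and the user's row id;
    nothing needs to be stored per-cookie on the server."""
    fields = dict(part.split("=", 1) for part in cookie.split("; "))
    msg = ("%s%s%s%s" % (row_id, fields["userName"], ip_addr, fields["salt"])).encode()
    expected = hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, fields["authCode"])
```

A cookie verified from a different IP address fails, which is the binding the ipAddr term buys you.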

Here is what you do need to store on the server in order to authenticate each request.
UserId (obvious)
CookieHash (made out of userId, some secret private key, and a crypto-randomly generated number)
SessionRenewed (useful for deciding when to cancel someone's session, e.g. renew the CookieHash every 10 min, otherwise log out the user)
What I store in the cookie is the following

Sharding Ids at Instagram

Generate IDs in web application
Generate IDs through dedicated service
  Twitter Snowflake
  DB Ticket Servers
Their Solution
  Uses PostgreSQL schemas and a time component along with their custom epoch
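Instagram's scheme packs a 64-bit id as 41 bits of milliseconds since a custom epoch, 13 bits of shard id, and 10 bits of a per-shard sequence. In their write-up this happens inside a PL/pgSQL function; here is just the bit layout sketched in Python, with a made-up epoch value:

```python
import time

CUSTOM_EPOCH_MS = 1293840000000  # illustrative custom epoch (2011-01-01 UTC)

def make_id(shard_id, sequence, now_ms=None):
    """64-bit id: 41 bits time | 13 bits shard | 10 bits sequence."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    elapsed = now_ms - CUSTOM_EPOCH_MS
    return (elapsed << 23) | ((shard_id % 8192) << 10) | (sequence % 1024)

def unpack_id(packed):
    """Recover (ms since epoch, shard_id, sequence) from an id."""
    return packed >> 23, (packed >> 10) & 0x1FFF, packed & 0x3FF
```

Because the time occupies the high bits, ids sort chronologically, which is what lets them be used directly as primary keys without a central ticket server.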

Writing endian-independent code in C

I came across this article after reading that ØMQ considers your message to be a binary blob, so if, for instance, you are exchanging integers over the network between machines of different endianness, you must make sure to convert back and forth.

I think the simple solution is to always convert integers to network byte order when they leave a machine. Then on the receiving end you convert to host byte order. Problem solved.
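Python's struct module expresses this directly: the "!" format prefix means network (big-endian) byte order, the same convention C code gets from htonl()/ntohl(). A minimal sketch of both directions:

```python
import struct

def encode_u32(n):
    """Host integer -> 4 bytes in network (big-endian) byte order."""
    return struct.pack("!I", n)

def decode_u32(data):
    """4 network-order bytes -> host integer, whatever the host's endianness."""
    return struct.unpack("!I", data)[0]

# On the wire 0x12345678 becomes 12 34 56 78, most significant byte first,
# so both little- and big-endian hosts decode the same value.
wire = encode_u32(0x12345678)
```

This is exactly the "convert on the way out, convert on the way in" rule; the message body handed to ØMQ is then endian-safe.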

Protocol Buffers

I was led to this by the following statement. "ØMQ doesn't know anything about the data you send except its size in bytes. That means you are responsible for formatting it safely so that applications can read it back. Doing this for objects and complex data types is a job for specialized libraries like Protocol Buffers."

Protocol Buffers are widely used at Google for storing and interchanging all kinds of structured information. Protocol Buffers serve as a basis for a custom remote procedure call (RPC) system that is used for nearly all inter-machine communication at Google.[3]

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.

A friend of mine asked why I would use this versus something like Java serialization. Here are a couple of good reasons to use Protocol Buffers versus Java serialization.

ZeroMQ True Peer Connectivity (Harmony Pattern)

Since ØMQ is designed to make distributed messaging easy, people often ask how to interconnect a set of true peers (as compared to obvious clients and servers). It is a thorny question and ØMQ doesn't really provide a single clear answer.

TCP, which is the most commonly-used transport in ØMQ, is not symmetric; one side must bind and one must connect and though ØMQ tries to be neutral about this, it's not. When you connect, you create an outgoing message pipe. When you bind, you do not. When there is no pipe, you cannot write messages (ØMQ will return EAGAIN).

Developers who study ØMQ and then try to create N-to-N connections between sets of equal peers often try a ROUTER-to-ROUTER flow. It's obvious why: each peer needs to address a set of peers, which requires ROUTER. It usually ends with a plaintive email to the list.

My conclusion after trying several times from different angles is that ROUTER-to-ROUTER does not work. And the ØMQ reference manual does not allow it when it discusses ROUTER sockets in zmq_socket(). At a minimum, one peer must bind and one must connect, meaning the architecture is not symmetrical. But also because you simply can't tell when you are allowed to safely send a message to a peer. It's Catch-22: you can talk to a peer after it's talked to you. But the peer can't talk to you until you've talked to it. One side or the other will be losing messages and thus has to retry, which means the peers cannot be equal.

I'm going to explain the Harmony pattern, which solves this problem, and which we use in Zyre.

We want a guarantee that when a peer "appears" on our network, we can talk to it safely, without ØMQ dropping messages. For this, we have to use a DEALER or PUSH socket which connects out to the peer so that even if that connection takes some non-zero time, there is immediately a pipe, and ØMQ will accept outgoing messages.

A DEALER socket cannot address multiple peers individually. But if we have one DEALER per peer, and we connect that DEALER to the peer, we can safely send messages to a peer as soon as we've connected to it.

Now, the next problem is to know who sent us a particular message. We need a reply address, that is the UUID of the node who sent any given message. DEALER can't do this unless we prefix every single message with that 16-byte UUID, which would be wasteful. ROUTER does, if we set the identity properly before connecting to the router.

And so the Harmony pattern comes down to:

One ROUTER socket that we bind to an ephemeral port, which we broadcast in our beacons.
One DEALER socket per peer that we connect to the peer's ROUTER socket.
Reading from our ROUTER socket.
Writing to the peer's DEALER socket.
The next problem is that discovery isn't neatly synchronized. We can get the first beacon from a peer after we start to receive messages from it. A message comes in on the ROUTER socket and has a nice UUID attached to it, but no physical IP address and port. We have to force discovery over TCP. To do this, our first command to any new peer we connect to is an OHAI command with our IP address and port. This ensures that the receiver connects back to us before trying to send us any command.

Breaking this down into steps:

If we receive a UDP beacon we connect to the peer.
We read messages from our ROUTER socket, and each message comes with the UUID of the sender.
If it's an OHAI message we connect back to that peer if not already connected to it.
If it's any other message, we must already be connected to the peer (a good place for an assertion).
We send messages to each peer using a dedicated per-peer DEALER socket, which must be connected.
When we connect to a peer we also tell our application that the peer exists.
Every time we get a message from a peer, we treat that as a heartbeat (it's alive).
If we were not using UDP but some other discovery mechanism, I'd still use the Harmony pattern for a true peer network: one ROUTER for input from all peers, and one DEALER per peer for output. Bind the ROUTER, connect the DEALER, and start each conversation with an OHAI equivalent that provides the return IP address and port. You would need some external mechanism to bootstrap each connection.
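The mechanical core of the pattern, a DEALER with an explicit identity connecting into a peer's bound ROUTER so the ROUTER learns who each message came from, can be shown with pyzmq over the inproc transport. The endpoint name and identity value here are made up; a real Zyre node would use its 16-byte UUID and a TCP endpoint:

```python
import zmq

ctx = zmq.Context.instance()

# The "peer": one ROUTER socket bound for all inbound traffic.
router = ctx.socket(zmq.ROUTER)
router.bind("inproc://harmony-demo")

# Us: one DEALER per peer, identity set BEFORE connecting so the
# ROUTER can use it as the reply address.
dealer = ctx.socket(zmq.DEALER)
dealer.setsockopt(zmq.IDENTITY, b"peer-uuid-1")
dealer.connect("inproc://harmony-demo")

# connect() creates the outgoing pipe immediately, so this send is
# safe even before the connection completes (the Harmony guarantee).
dealer.send(b"OHAI tcp://10.0.0.1:5670")

# The ROUTER sees [sender identity, payload]: the reply address for free.
identity, payload = router.recv_multipart()
```

The identity frame is exactly the "who sent us this" information the text says DEALER alone can't provide without prefixing every message by hand.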

Easy ØMQ Patterns


A subscriber can connect to more than one publisher, using one 'connect' call each time. Data will then arrive and be interleaved ("fair-queued") so that no single publisher drowns out the others.
If a publisher has no connected subscribers, then it will simply drop all messages.
If you're using TCP, and a subscriber is slow, messages will queue up on the publisher. We'll look at how to protect publishers against this, using the "high-water mark" later.
From ØMQ 3.x, filtering happens at the publisher side, when using a connected protocol (tcp: or ipc:). Using the epgm:// protocol, filtering happens at the subscriber side. In ØMQ/2.x, all filtering happened at the subscriber side.


Avoiding Classloader leaks

"OutOfMemoryError: PermGen" is a very common message to see after a few redeploys. The reason why it's so common is that it's amazingly easy to leak a class loader. It's enough to hold a single outside reference to an object instantiated from a class loaded by the said class loader to prevent that class loader from being GC-d.

How to become a better programmer?

The article below is okay; I don't completely agree with everything he says, but it's worthwhile...

1. It doesn’t matter how many years experience in carpentry you have had or how well you can design furniture or cabinetry if every time you try to cut wood you struggle with making the cuts.
Cutting wood is a base skill of carpentry, just like problem solving is the base skill of software development.

2. There is probably no more important skill in life than learning to learn.

3. When people ask me what I do all day, I mostly say “read things other people name and name things.”

You may have heard someone say there is a difference between a programmer with 10 years of experience and a programmer with 1 year of experience 10 times.

Different types of garbage collection -- quite interesting.

Mark and sweep (MS)

Starting from a known root set, the GC traces all reachable memory objects by following pointers. Objects reached in this way, and therefore visible for use by the program, are alive. Objects which are not reached in the trace are marked dead. In the second stage, sweep, all dead objects are destroyed and reclaimed.

Tri-color mark and sweep

Instead of a simple separation of marked (as live) and unmarked (dead), the object set is divided into three parts: white, gray, and black. The white objects are presumed dead. The gray objects have been marked as live by some other object, but haven't yet marked the objects they refer to. The black objects are live, and have marked all objects they directly refer to.

In the initial run, all objects start as white and the root set is marked gray. The marking process changes white objects to gray (marking them from another gray object), and gray objects to black (when all objects they refer to are marked). When the gray set is empty, all live objects have been marked and the white set can be collected. After a collection run, all black objects are reset to white, the root set to gray, and the process begins again.

The advantage of a tri-color mark over a simple mark is that it can be broken into smaller stages.
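The tri-color invariant described above can be sketched in a few lines of Python (a toy model, not any real collector). Each step() blackens exactly one gray object, which is precisely what allows the mark phase to be broken into smaller stages:

```python
class TriColorGC:
    """Toy tri-color mark phase over an object graph {obj: [refs...]}."""

    def __init__(self, graph, roots):
        self.graph = graph
        self.white = set(graph)     # presumed dead
        self.gray = set()           # marked live, refs not yet scanned
        self.black = set()          # marked live, refs scanned
        for r in roots:             # the root set starts gray
            self.white.discard(r)
            self.gray.add(r)

    def step(self):
        """Blacken one gray object; returns False when marking is done."""
        if not self.gray:
            return False
        obj = self.gray.pop()
        for ref in self.graph[obj]:
            if ref in self.white:   # white -> gray when first reached
                self.white.discard(ref)
                self.gray.add(ref)
        self.black.add(obj)
        return True

    def collect(self):
        """Run marking to completion; what stays white is garbage."""
        while self.step():
            pass
        return self.white           # the set to sweep
```

Note that an unreachable cycle (two objects referring only to each other) never leaves the white set, which is the classic advantage of tracing over plain reference counting.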

Copying collection

A copying GC copies objects from one memory region to another during the mark phase. At the end of the mark, all memory in the old region is dead and the whole region can be reclaimed at once.

Compacting collection

The compacting GC moves live objects close together in a single region in memory. This helps to eliminate fragmented free space and allows the allocation of large live objects. Compacting and copying collectors are often similar or even identical in implementation.


Uncooperative vs. cooperative

An uncooperative GC is implemented as a separate module, often without affecting the remainder of the program. The programmer can write software without needing to be aware of the operations or implementation of the GC. The alternative is a cooperative GC, which is often implemented as a reference counting scheme and requires GC-related logic to be dispersed throughout the entire program.


Stop-the-world

A common disadvantage of a simple mark implementation is that the entire system (including all threads that use the same memory pools) must be suspended while the whole memory set is examined during marking and collection. Normal operation continues only after the whole GC cycle is performed. This can lead to arbitrarily long pauses during program execution.


Incremental

In order to alleviate the arbitrarily long pauses in a stop-the-world GC, the incremental GC breaks the mark and sweep process up into smaller, shorter phases. Each GC phase may still require the entire program to pause, but the pauses are shorter and more frequent.


Real-time

A real-time GC guarantees that the pauses caused by GC don't exceed a certain limit.


Generational

The object space is divided between a young generation (short-lived temporaries) and one or more old generations. Only young generations are reset to white (presumed dead). The older generations are scanned less often because it is assumed that long-lived objects tend to live longer.


Concurrent

GC marking and collection runs as a separate thread, sometimes with multiple threads participating in GC. On a multi-processor machine, concurrent GC may be truly parallel.


Conservative

A conservative GC traces through memory looking for pointers to living objects. The GC does not necessarily have information about the layout of memory, so it cannot differentiate between an actual pointer and an integral value which has the characteristics of a pointer. The conservative GC follows a policy of "no false negatives" and traces any value which appears to be a pointer.


Precise

A precise GC has intimate knowledge of the memory layout of the system and knows where to find pointers. In this way the precise collector never has any false positives.

Running the JVM with large amounts of RAM.

Got me thinking about the cost of GCing a large heap, which depending on the size could take minutes when a full GC occurs.

Read about Azul and their pauseless collector, along with their hardware and the specialized instructions that go with it.

If your application is not interactive, and GC pauses are not an issue for you, there shouldn't be any problem for 64-bit Java to handle very large heaps, even in hundreds of GBs. We also haven't noticed any stability issues on either Windows or Linux.

However, when you need to keep GC pauses low, things get really nasty:

Forget the default throughput, stop-the-world GC. It will pause your application for several tens of seconds for moderate heaps (< ~30 GB) and several minutes for large ones (> ~30 GB). And buying faster DIMMs won't help.

The best bet is probably the CMS collector, enabled by -XX:+UseConcMarkSweepGC. The CMS garbage collector stops the application only for the initial marking and remarking phases. For very small heaps (< 4 GB) this is usually not a problem, but for an application that creates a lot of garbage and has a large heap, the remarking phase can take quite a long time - usually much less than a full stop-the-world pause, but it can still be a problem for very large heaps.

When the CMS garbage collector is not fast enough to finish its work before the tenured generation fills up, it falls back to standard stop-the-world GC. Expect pauses of ~30 seconds or more for heaps of size 16 GB. You can try to avoid this by keeping the long-lived garbage production rate of your application as low as possible. Note that the more cores are running your application, the bigger this problem gets, because the CMS utilizes only one core. Obviously, beware that there is no guarantee the CMS does not fall back to the STW collector. And when it does, it usually happens at peak load, and your application is dead for several seconds. You would probably not want to sign an SLA for such a configuration.

Well, there is that new G1 thing. It is theoretically designed to avoid the problems with CMS, but we have tried it and observed that:

Its throughput is worse than that of CMS.
It theoretically should avoid collecting the popular blocks of memory first, however it soon reaches a state where almost all blocks are "popular", and the assumptions it is based on simply stop working.
Finally, the stop-the-world fallback still exists for G1; ask Oracle when that code is supposed to run. If they say "never", ask them why the code is there. So IMHO G1 doesn't really make the huge-heap problem of Java go away; it only makes it (arguably) a little smaller.
If you have the bucks for a big server with big memory, you probably also have the bucks for a good, commercial, hardware-accelerated, pauseless GC technology, like the one offered by Azul. We have one of their servers with 384 GB RAM and it really works fine - no pauses, zero lines of stop-the-world code in the GC.

Write the damn part of your application that requires lots of memory in C++, like LinkedIn did with social graph processing. You still won't avoid all the problems by doing this (e.g. heap fragmentation), but it would be definitely easier to keep the pauses low.

Oracle is trying to address the problem

Java 9

"At JavaOne 2011, Oracle discussed features they hope to have in Java 9, including better support for multi-gigabyte heaps, better native code integration, and a self-tuning JVM."

JVM Fixed Upper Limit for Memory Usage

Well, there is a paper from Sun (eh, Oracle) that explains a lot about GC internals. There it says: "By default, the virtual machine grows or shrinks the heap at each collection to try to keep the proportion of free space to live objects at each collection within a specific range" (section 4.1). So it's not really "full GC before heap increase"; rather it's "heap increase if full GC does not free up enough".

Why is memory management so visible in Java VM?

Java gives you a bit more control about memory -- strike one for people wanting to apply that control there, vs Ruby, Perl, and Python, which give you less control on that. Java's typical implementation is also very memory hungry (because it has a more advanced garbage collection approach) wrt the typical implementations of the dynamic languages... but if you look at JRuby or Jython you'll find it's not a language issue (when these different languages use the same underlying VM, memory issues are pretty much equalized). I don't know of a widespread "Perl on JVM" implementation, but if there's one I'm willing to bet it wouldn't be measurably different in terms of footprint from JRuby or Jython!

Python/Perl/Ruby allocate their memory with malloc() or an optimization thereof. The limit to the heap space is determined by the operating system rather than the VM, so there's no need for options like -Xmxn. Also, the garbage collection is simpler, based mostly on reference counting. So there's a lot less to fine-tune.

Furthermore, dynamic languages tend to be implemented with bytecode interpreters rather than JIT compilers, so they aren't used for performance-critical code anyway.

"So why does the JVM have (need?) a ceiling at all? Why can't it be flexible enough to request more memory from the OS when the need arises?" The Sun JVM can be easily configured to do just that. It's not the default because you have to be careful that your process doesn't cause the OS to thrash

Changing the JVM is not a panacea. You can get new unexpected issues (e.g. see an article about launching an application under 4 different JVMs).

You can have a class leak (e.g. via classloaders) that most often happens on redeploy. Frankly, I've never seen hot redeploy work on Tomcat (hope to see it one day).
You can have incorrect JVM parameters (e.g. for Sun JDK 6 64-bit, the -XX:+UseParNewGC switch leads to a leak in the PermGen segment of memory. If you add the additional switches -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled the situation is resolved. Funny, but I never hit the above-mentioned leak with Sun JDK 6 32-bit). Link to an article, "Tuning JVM Garbage Collection for Production Deployments".
The PermGen chunk can be too small to load classes and related information (this most often happens after redeploy under Tomcat: old classes stay in memory while new ones are loading) -- Great article!!

Ming ODM

Ming is a Python toolkit providing schema enforcement, an object/document mapper, an in-memory database, and various other goodies developed at SourceForge during our rewrite of the site from a PHP/Postgres stack to a Python/MongoDB one.

While this dynamic behavior is handy in a rapid development environment where you might delete and re-create the database many times a day, it starts to be a problem when you need to make guarantees about the type of data in a collection (because your code depends on it). The goal of Ming is to allow you to specify the schema for your data in Python code and then develop in confidence, knowing the format of the data you get from a query.

Spring Integration vs. Apache Camel for EIP

After reading the comparison below I am leaning toward Camel for integration patterns. I have used Spring Integration some in the past and the configuration was quite verbose. Another benefit of Camel is its wide range of endpoints, from Twitter all the way to Hadoop.

Some other messaging technologies


OpenAMQ - Seems this was made right around the time ZeroMQ started


ZeroMQ Benefits and History

iMatix is behind ZeroMQ and looks like a consulting company for it which specializes in using it for the financial industry. "ZeroMQ is used by around 160 firms, estimates FastMQ CEO Martin Sustrik, in everything from game servers to scientific computing to the sector it's really meant for, financial market data. He continues, people appreciate how light and fast this software is. It runs on practically every system, speaks every language, and is very, very fast."

ZeroMQ had major incompatibilities between versions of the API, even making changes to the wire protocol. These major changes have stopped, though, and the project was forked to create

ZeroMQ should now be much more stable between versions.

You don't have to make sure that the server is bound to the socket before clients connect.

Con apparently?

What happens if a disconnect happens between the send and the recv?
Answer: It seems that you can set a timeout or use a Poller. This part of the guide talks a lot about the problems this guy was having.

Monday, December 17, 2012

Load testing Web Applications and Client Side Javascript

Two topics here.
 load testing on the server
 load testing on the front end

Load testing on the Server.

This is kind of in the middle. For load testing the server, verifying responses, and interacting with the page returned, Multi-Mechanize looks pretty sweet. It's a Python library that reminds me somewhat of Selenium.

Load testing on the front end

Jiffy is an end-to-end real-world web page instrumentation and measurement suite. Jiffy is a novel idea in load testing: instead of measuring the performance of the web server, we measure the time it takes to load the web page on the client and run the JavaScript.

This would probably be a nice setup at a company if you had a great set of selenium tests that were maintained. This would most likely take a while to run and by running them in parallel you could definitely speed up the process.

Selenium RC is a project for language bindings to selenium.

If you have money to throw at the problem, this may be of interest.

Latent Dirichlet Allocation and Bayes' Theorem

Bayesian filtering uses a naive Bayes classifier to determine the likelihood of an email being spam or non-spam based upon the statistical likelihood of tokens in the email. In probability theory and statistics, Bayes' theorem (alternatively Bayes' law) is a theorem with two distinct interpretations. In the Bayesian interpretation, it expresses how a subjective degree of belief should rationally change to account for evidence. In the frequentist interpretation, it relates inverse representations of the probabilities concerning two events. In the Bayesian interpretation, Bayes' theorem is fundamental to Bayesian statistics, and has applications in fields including science, engineering, economics (particularly microeconomics), game theory, medicine and law. The application of Bayes' theorem to update beliefs is called Bayesian inference. Good explanation of Bayes' theorem along with an example. Still not 100% sure how Latent Dirichlet Allocation is related to Bayesian filtering?
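The naive Bayes spam calculation is just Bayes' theorem applied per token. With made-up token frequencies, the single-word case looks like this:

```python
def p_spam_given_word(p_word_given_spam, p_word_given_ham, p_spam=0.5):
    """Bayes' theorem: P(S|W) = P(W|S)P(S) / (P(W|S)P(S) + P(W|H)P(H))."""
    p_ham = 1.0 - p_spam
    numerator = p_word_given_spam * p_spam
    evidence = numerator + p_word_given_ham * p_ham
    return numerator / evidence

# Illustrative numbers: a token appearing in 40% of spam, 0.1% of ham.
posterior = p_spam_given_word(0.40, 0.001)
```

The "naive" part of the classifier is then multiplying these per-token likelihoods together as if tokens were independent, which they aren't, but it works well enough for spam.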

Git grep

Great in-depth article on browser internals

Tools and Ideas I would use/look at for ideas if I were developing my own project

- Travis CI
gclient: a meta-checkout tool managing both Subversion and Git checkouts. It is similar to the repo tool except that it works on Linux, OS X, and Windows and supports both svn and git. This is nice since people waste a lot of time checking out multiple repos to get started on a project.
Think about how exceptions/errors will be handled, i.e. one of the following:
1. Simply let them occur and search logs for them
2. Maybe stick them in some kind of database so you can easily look up common errors
3. Always catch them and just log them, vs. option 1 where they would probably cancel the request once they occurred in a web app

Major Browsers and Layout Engines Quick Reminder

Mozilla Firefox uses the Gecko layout engine. IE since 4.0 uses the Trident engine. Safari, Chrome, and the iPad, iPhone, and Android web browsers all use WebKit.

Leave off the scheme in a URI

<script src="//"></script>

Steve Yegge Notes from the Mystery Machine Bus

Assembly language: Batshit liberal.

Perl, Ruby, PHP, shell-script: Extremist liberal.

JavaScript, Visual Basic, Lua: Hardcore liberal.

Python, Common Lisp, Smalltalk/Squeak: Liberal.

C, Objective-C, Scheme: Moderate-liberal.

C++, Java, C#, D, Go: Moderate-conservative.

Clojure, Erlang, Pascal: Conservative.

Scala, Ada, OCaml, Eiffel: Hardcore conservative.

Haskell, SML: Extremist conservative.

As much as some commenters are (weirdly?) railing against this classification scheme I think the underlying idea that software conservatism is about risk aversion is essentially accurate.
Perhaps another way of framing this is to ask the question: are you optimizing for the best case or the worst case? This ultimately is a form of risk management. And I'm not talking in the algorithmic sense, meaning complexity expressed as the asymptotically worst case. I'm talking about people, software and ecosystems.
Let me illustrate this idea with Java.
- C++ has operator overloads. Java does not? Why? Because people might abuse them. That's optimizing for the worst case (ie bad or inexperienced programmers). Properly used, operator overloading can lead to extremely readable code;
- Java has checked exceptions and uses them liberally (pun intended). C#, as one example, only has unchecked exceptions. Why? Philosophically the Java language designers (and many of its users) feel that this forces callers to deal with exceptions. Pragmatically (IMHO) it does not and leads to more cases of exceptions being simply swallowed. But again this is optimizing for the worst case ie programmers who should deal with a particular error condition but won't;
- Java has no multiple inheritance. Same story: it can be abused ("it is known"). But also mixins can be a powerful metaphor.
- Rinse and repeat for duck typing, extension methods, etc.
Putting Python two steps from Ruby strikes me as an interesting choice. I'd say the difference is at most one.
I'll also agree that Google as a company (based on my own much more limited experience than Yegge's) is firmly conservative. The style of writing Javascript that he refers to is about writing Google Closure code with all sorts of directives to aid the Closure Compiler (I describe Closure as putting the Java back into Javascript).

Paul Graham Reading

Nerds don't just happen to dress informally. They do it too consistently. Consciously or not, they dress informally as a prophylactic measure against stupidity.

A nerd, in other words, is someone who concentrates on substance.

The prospect of technological leverage will of course raise the specter of unemployment. I'm surprised people still worry about this. After centuries of supposedly job-killing innovations, the number of jobs is within ten percent of the number of people who want them. This can't be a coincidence. There must be some kind of balancing mechanism.

LZW Compression front to back for JSON communication

Doing LZW compression front to back now; it mostly just obfuscates but does provide some compression. I saw the fingerprint go down from ~6k to ~4k after compression. Big win: it's fast, on the order of < 10ms to run the compression algorithm. The code to do the compression is also really small.
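For reference, a minimal LZW over plain strings might look like the sketch below. The function names are mine, and a real deployment would pack the output codes into a compact string (e.g. one UTF-16 character per code) before shipping the JSON.

```javascript
// Sketch of LZW: build a dictionary of substrings on the fly; emit a code
// each time the current prefix can no longer be extended.
function lzwCompress(s) {
  var dict = {}, out = [], next = 256;
  for (var i = 0; i < 256; i++) dict[String.fromCharCode(i)] = i;
  var w = "";
  for (var j = 0; j < s.length; j++) {
    var c = s.charAt(j);
    if (dict.hasOwnProperty(w + c)) {
      w = w + c;                 // extend the current phrase
    } else {
      out.push(dict[w]);         // emit code for the known prefix
      dict[w + c] = next++;      // learn the new phrase
      w = c;
    }
  }
  if (w !== "") out.push(dict[w]);
  return out;
}

function lzwDecompress(codes) {
  var dict = {}, next = 256;
  for (var i = 0; i < 256; i++) dict[i] = String.fromCharCode(i);
  var w = dict[codes[0]], out = w;
  for (var j = 1; j < codes.length; j++) {
    var k = codes[j];
    // k may reference the phrase being defined right now (the "KwKwK" case)
    var entry = dict.hasOwnProperty(k) ? dict[k] : w + w.charAt(0);
    out += entry;
    dict[next++] = w + entry.charAt(0);
    w = entry;
  }
  return out;
}

var s = "TOBEORNOTTOBEORTOBEORNOT";
var codes = lzwCompress(s);
console.log(codes.length < s.length);     // fewer codes than input characters
console.log(lzwDecompress(codes) === s);  // round-trips cleanly
```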

Vim vertical edits

Type a command, say '8x', to delete 8 characters starting at the cursor
Then enter visual block mode with Ctrl-v and select the text you want, but do not leave this mode
Press ':' then type 'normal .'; this will repeat your last command on each line of the selected block

Useful resource Vim Character Patterns

Cookies are disabled?

A standard way of checking for cookie support is via a redirect.

For reasons I'll explain below, I think it's best to do a cookie check only when the user initiates an action that would require a cookie such as attempting to log in, or adding something to their cart.

First, the server checks the login data as normal - ie if the login data is wrong the user receives that feedback as normal. It immediately responds with a cookie, and a redirect to a page which is designed to check for cookie preferences - which may just be the same URL but with some flag added to the query string. This next page will then check to see if the client sent any cookie. If not, then the user receives a message stating that a cookie was not received and they should probably try to enable cookies if they want to log in.

Now for why I only do a cookie test after a user-initiated action other than simply loading a page. I have seen sites implement a cookie test on every single page, not realising that this is going to have effects on things like search engines trying to crawl the site. That is, if a user has cookies enabled, then the test cookie is set once, so they only have to endure a redirect on the first page they request and from then on there are no redirects. However, for any browser or other user-agent, like a search engine, that doesn't return cookies, every single page could have a redirect. While it'll still work and a lot of the time users won't see any difference, it is a lot more overhead and load than necessary.

Another method of checking for cookie support is with Javascript - this way, no redirect is necessarily needed - you can write a cookie and read it back virtually immediately to see if it was stored and then retrieved. The downside to this is it runs in script - ie if you still want the message about whether cookies are supported to get back to the server, then you still have to organise that - such as with an Ajax call.

For my own application, I implement some protection for 'Login CSRF' attacks, a variant of CSRF attacks, by setting a cookie containing a random token on the login screen before the user logs in, and checking that token when the user submits their login details. Read more about Login CSRF from Google. A side effect of this is that the moment they do log in, I can check for the existence of that cookie - an extra redirect is not necessary.

Unique Short URLs

Today I was trying to create a unique short url.

I was using the UUID class provided by Java earlier for id generation. This produced ids that were too large for our purposes.
So they recommended using something they were using on another project, hash(ip + timeofvisit).
I ended up using sha1(ip + timeofvisit), cut the digest in half from a 20-byte [] to ten bytes, and finally base64 encoded the bytes into a URL-safe string.

Later I got into a discussion about why I was using base64 encoding to shorten the length of the string. Here goes.
My point was that if you started off with the md5 hash (which produces a 128-bit digest) as a byte[],
then of the two representations, a hex-coded string and base64 encoding, the base64 version would be the smaller string.

My partner argued the below:
As this shows, base64-encoding a STRING obviously causes it to get larger.

dhcp199:apache-tomcat-7.0.33 randy$ php -r "echo md5('123').PHP_EOL; echo base64_encode(md5('123')).PHP_EOL;"

Here below you see my point. By passing true for raw_output the base64 encoded version is shorter.

randys-MacBook-Air:~ randy$ php -r "echo md5('123').PHP_EOL; echo base64_encode(md5('123', true)).PHP_EOL;"

Good python code on here for generating unique random looking ids from some sequential key
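The usual trick those generators build on is base-N encoding of the sequential key; here is a minimal base-62 sketch (toBase62/fromBase62 are illustrative names — real implementations often also permute the id first so consecutive keys don't produce near-identical codes):

```javascript
var ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Encode a non-negative integer id as a short base-62 string.
function toBase62(n) {
  if (n === 0) return "0";
  var s = "";
  while (n > 0) {
    s = ALPHABET.charAt(n % 62) + s;
    n = Math.floor(n / 62);
  }
  return s;
}

// Decode back to the integer id.
function fromBase62(s) {
  var n = 0;
  for (var i = 0; i < s.length; i++) {
    n = n * 62 + ALPHABET.indexOf(s.charAt(i));
  }
  return n;
}

console.log(toBase62(125));    // "21"
console.log(fromBase62("21")); // 125
```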

Monday, December 3, 2012

Illegal Characters in Cookies

Today I had an issue when setting a cookie in the browser where the server would simply not recognize that I had set the cookie. I was running Tomcat 7, and after a bunch of debugging I realized that it was because I had an @ sign in the cookie value. Interestingly, Tomcat didn't show an error; it just ignored the cookie, which was quite annoying.


public void setValue(String newValue)
Assigns a new value to a cookie after the cookie is created. If you use a binary value, you may want to use BASE64 encoding. With Version 0 cookies, values should not contain white space, brackets, parentheses, equals signs, commas, double quotes, slashes, question marks, at signs, colons, and semicolons. Empty values may not behave the same way on all browsers.
newValue - a String specifying the new value
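One way to sidestep the illegal-character rules is to encode the value before it goes into the cookie and decode it when reading it back. A sketch with plain percent-encoding (a servlet could equally use base64, per the Javadoc above):

```javascript
// "@" is one of the characters Version 0 cookies reject;
// encodeURIComponent turns it into "%40", which is safe.
var raw = "user@example.com";
var safe = encodeURIComponent(raw);
console.log(safe);                             // "user%40example.com"
console.log(decodeURIComponent(safe) === raw); // true: lossless round-trip
```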

Saturday, December 1, 2012

Intel’s Haswell is an unprecedented threat to Nvidia, AMD

Transactional Synchronization eXtensions

Transactional Synchronization eXtensions (TSX) extend the x86 ISA with two new interfaces: HLE and RTM.

Restricted Transactional Memory (RTM) uses Xbegin and Xend, allowing developers to mark the start and end of a critical section. The CPU will treat this piece of code as an atomic transaction. Xbegin also specifies a fallback path in case the transaction fails. Either everything goes well and the code runs without any lock, or the shared variable(s) that the thread is working on are overwritten. In that case, the code is aborted and the transaction has failed. The CPU will then execute the fallback path, which is most likely a piece of code that does coarse-grained locking. RTM-enabled software will only run on Haswell and is thus not backwards compatible, so it might take a while before this form of Hardware Transactional Memory is adopted.

The most interesting interface in the short term is Hardware Lock Elision, or HLE. It first appeared in a paper by Ravi Rajwar and James Goodman in 2001. Ravi is now a CPU architect at Intel and presented TSX together with his colleague Martin Dixon at IDF 2012.
The idea is to remove the locks and let the CPU worry about consistency. Instead of assuming that a thread should always protect the shared data from other threads, you optimistically assume that the other threads will not overwrite the variables that the thread is working on (in the critical section). If another thread overwrites one of those shared variables anyway, the whole process will be aborted by the CPU, and the transaction will be re-executed but with a traditional lock.
If the lock removing or elision is successful, all threads can work in parallel. If not, you fall back to traditional locking. So the developer can use coarse grained locking (for example locking the entire shared structure) as a "fall back" solution, while Lock Elision can give the performance that software with a well tuned fine grained locking library would get.

According to Ravi and Martin, the beauty is that the developers of locking libraries simply have to add a few HLE instructions without breaking backwards compatibility. The application developer uses the new TSX-enabled library and gets the benefits of TSX if the application runs on Haswell or a later Intel CPU.

Javascript class with {SUPER: SYSTEM}

Defines some nice ways to declare classes in JavaScript with true instance variables and private properties.

Javascript hoisting and scoping

var a = 1;
function b() {
    a = 10;
    return;
    function a() {}
}
b();
alert(a); // alerts 1

The declaration function a() {} is hoisted to the top of b, so it creates a local a; the assignment a = 10 therefore changes that local a, not the global one, and the global a stays 1.

Jquery deferreds with pipe and when
function getCustomerSSNById(customerId){
    var deferred = $.Deferred();
    setTimeout(function() {deferred.resolve({"ssn" : "111 44 9999"});}, 300);
    return deferred.pipe(function(p){
        return p.ssn;
    });
}

function getPersonAddressBySSN(ssn){
    var deferred = $.Deferred();
    console.log("ssn is " + ssn);
    setTimeout(function() {deferred.resolve({"address" : "123 blah st"});}, 1000);
    return deferred.pipe(function(p){
        return p.address;
    });
}

function getPersonAddressById(id){
    return getCustomerSSNById(id).pipe(getPersonAddressBySSN);
}

getPersonAddressById(123).done(function(a){
    alert("The address is " + a);
});

And using when

$.when(getCustomerSSNById(123), getPersonAddressBySSN("123 45 6789"))
    .done(function(ssn, address){
        alert("The ssn is " + ssn + " and the address is " + address);
    });

Using the google closure compiler

java -jar ~/apps/google-closure-compiler/compiler.jar --compilation_level ADVANCED_OPTIMIZATIONS --js=../src/main/webapp/resources/evercookie/evercookie.js --js_output_file=out_advanced.js

You must define externs if you plan to compile your code separately from, say, jQuery.
--externs src/main/config/jquery-1.8.js

Make sure to define global properties that you need between source files like below for the closure compiler to work properly with
advanced optimizations
window['someobject'] = someobject;


Five of the top seven carriers in the U.S. use CDMA: Verizon Wireless, Sprint, MetroPCS, Cricket, and U.S. Cellular.

AT&T and T-Mobile use GSM.

Wondered why Amazon ec2 virtual machine instances were so fast?

Read an article that said that they used Xen Hypervisor
Take a look at Xen. It's interesting because it offers better performance than traditional
virtualisation systems by modifying both host OS and guest OS. It's really interesting to learn about
its optimisation techniques.
Apparently KVM is going to be the upcoming technology.

Under a hypervisor, does the guest OS run in protected mode?

Seems it does, and there are different levels (rings) of protected mode: 0 for the kernel and 3 for applications.
Intel and AMD both implement different ways to support virtualization; one approach adds a protection level of -1 for the hypervisor.

More on Tor security

This article talks about Tor in relation to what it prevents, and gives a high-level overview of how to track users.
Panopticlick: EFF browser fingerprinting.

Tor Inner workings

Seems to be related to how tor works behind a NAT

jsPerf — JavaScript performance playground

jsPerf aims to provide an easy way to create and share test cases, comparing the performance of different JavaScript snippets by running benchmarks.

Writing Fast, Memory-Efficient JavaScript


The --trace-deopt flag looks interesting for seeing what code V8 had to deoptimize.

Very interesting: a tool that detects what objects are causing leaks.

Resource Timing
Navigation Timing

MultiLevel source mapping in Chrome

Pretty amazing stuff the Chrome developer tools team is doing: you can go from CoffeeScript down to minified JS in the browser. Source maps let you view the CoffeeScript in the browser even though minified JS is what is really running.

The Breakpoint Ep 3: The Sourcemap Spectacular with Paul Irish and Addy Osmani

Travis CI

Continuous integration platform integrates well with github, pretty hot stuff.

Saturday, November 10, 2012

Thursday, November 8, 2012

B Tree Algorithms

Binary Tree: a tree data structure in which each node has at most two child nodes

Binary Search Tree: a sorted, node-based binary tree with the property that each node's left subtree contains only keys less than the node's key, and its right subtree only keys greater

B-Tree: a generalization of a binary search tree in which a node can have more than two children

B+ Tree: the primary value of a B+ tree is in storing data for efficient retrieval in a block-oriented storage context—in particular, filesystems. This is primarily because unlike binary search trees, B+ trees have very high fanout (typically on the order of 100 or more), which reduces the number of I/O operations required to find an element in the tree.

B+ Trees are used by many file systems as well as DBs for internal storage.
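A toy insert/in-order walk makes the BST ordering property concrete (a sketch only, not a balanced implementation; the function names are mine):

```javascript
// Keys in a node's left subtree are smaller, keys in its right are larger.
function insert(node, key) {
  if (node === null) return { key: key, left: null, right: null };
  if (key < node.key) node.left = insert(node.left, key);
  else if (key > node.key) node.right = insert(node.right, key);
  return node; // duplicates are ignored
}

// In-order traversal visits keys in sorted order -- a direct
// consequence of the ordering property.
function inorder(node, out) {
  if (node === null) return out;
  inorder(node.left, out);
  out.push(node.key);
  inorder(node.right, out);
  return out;
}

var root = null;
[5, 2, 8, 1, 9].forEach(function (k) { root = insert(root, k); });
console.log(inorder(root, [])); // [1, 2, 5, 8, 9]
```

A B-tree generalizes exactly this: each node holds many sorted keys with a child pointer between each pair, which is what gives it the high fanout.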

Tuesday, November 6, 2012

Protocols of the Semantic Web

  1. Microformats
  2. Open Graph Protocol
    <meta property="og:title" content="Take a Bite!">
    <meta property="og:type" content="university">
    <meta property="og:url" content="">
    <meta property="og:image" content="cookie.png">
    <meta property="og:site_name" content="Take a Bite!">
    <meta property="og:description" content="Today, most webpages use cookies to keep track of and identify its user. We want to change this. Help us by taking a bite.">
    <meta property="fb:admins" content="677736282,659997616">

Monday, November 5, 2012

HTML5, JS mobile frameworks

There are a couple of choices out there when it comes to HTML5/JavaScript mobile frameworks.

We have PhoneGap, and The-M-Project, which is based on PhoneGap. Both of these package up your web application into, in effect, a wrapper for each of the mobile platforms: Android, iOS, ...

Appcelerator is completely different: it actually builds native code from your JavaScript. This in my mind could be likened to GWT, which converts Java -> JavaScript.

Another good comparison.
Seems PhoneGap is the most stable.

Intel Haswell

New power saving innovations on Haswell

The New Sleep States: S0ix

Panel Self Refresh Technology

Google Maps API Issues

Over the weekend I attended the DCWEEK Hackathon and while there developed an application that made heavy use of Google Maps. I ran into some issues when integrating jQuery Mobile and Google Maps; I found the following article after the fact that details a lot of the problems I had.

The problems boil down to NOT having the div that the map will display into visible when Google Maps is initialized.


  1. Off-left technique
    1. .ui-tabs .ui-tabs-hide {
          position: absolute;
          left: -10000px;
       }
  2. Resize the map when idle
    1. The below can also be combined with a setTimeout, which I had to use to get it to work when deployed to a phone browser. HACK I know
    2. google.maps.event.addListenerOnce(map, 'idle', function() {
          google.maps.event.trigger(map, 'resize');
          map.setCenter(point); // be sure to reset the map center as well
       });
  3. Asynchronously Loading the API

Varnish Cache

Varnish Cache allows you to cache dynamic and static content. It has a pretty easy configuration language that lets you configure how to cache things such as user-specific content as well as general dynamic content. Pretty sweet stuff if you need to scale out your site quickly.

Mongodb Aggregation Examples

Provides the ability to write queries similar to using group by in SQL without using map-reduce.
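For example, summing an amount per customer. The pipeline shape below uses the real $group/$sum operators, but the collection and field names are made up; the plain-JS function next to it shows the equivalent grouping so the snippet is runnable anywhere.

```javascript
// Rough shape of the aggregation you'd run in the mongo shell:
//   db.orders.aggregate([{ $group: { _id: "$cust", total: { $sum: "$amt" } } }])
// The same grouping expressed over a plain array:
var orders = [
  { cust: "a", amt: 10 },
  { cust: "b", amt: 5 },
  { cust: "a", amt: 7 }
];

function groupSum(docs, keyField, sumField) {
  var totals = {};
  docs.forEach(function (d) {
    totals[d[keyField]] = (totals[d[keyField]] || 0) + d[sumField];
  });
  return totals;
}

console.log(groupSum(orders, "cust", "amt")); // { a: 17, b: 5 }
```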

Friday, November 2, 2012

Microbenchmarking on the JVM and Long vs Int

While watching a video on lock-free hashtables I noticed that you can use & instead of % (when the table size is a power of two) and gain a performance bump. So I did some benchmarks. (Note that s & 7 actually computes s % 8, not s % 7, so the two loops below don't compute the same value, but the timing comparison still stands.) Turns out that with modern hardware (I am using a Core i5) the difference between & and modulo wasn't that big. What I found interesting, though, was that when I tried to up the number of loop iterations I had to switch to a long, which made a large difference. Seems that even on a 64-bit architecture with a 64-bit JVM, a loop with a long as the iteration variable was significantly slower. Any ideas?

class Blah {
    public static void main(String[] args) {
        System.out.println("Starting timing of modulo version.\n");
        long startTime = System.nanoTime();
        modulofunc();
        long endTime = System.nanoTime();

        long duration = endTime - startTime;
        System.out.println("Modulo version took " + duration + ".\n");

        System.out.println("Starting timing of and version.\n");
        startTime = System.nanoTime();
        andfunc();
        endTime = System.nanoTime();

        duration = endTime - startTime;
        System.out.println("And version took " + duration + ".\n");

        System.out.println("Starting timing of int version.\n");
        startTime = System.nanoTime();
        intfunc();
        endTime = System.nanoTime();

        duration = endTime - startTime;
        System.out.println("Int version took " + duration + ".\n");

        System.out.println("Starting timing of long version.\n");
        startTime = System.nanoTime();
        longfunc();
        endTime = System.nanoTime();

        duration = endTime - startTime;
        System.out.println("Long version took " + duration + ".\n");
    }

    static void modulofunc() {
        int s = 32;
        int e;
        for (int i = 0; i < 1_000_000_000; i++) {
            e = s % 7;
        }
    }

    static void andfunc() {
        int s = 32;
        int e;
        for (int i = 0; i < 1_000_000_000; i++) {
            e = s & 7;
        }
    }

    static void intfunc() {
        int s = 32;
        int e;
        for (int i = 0; i < 1_000_000_000; i++) {
            e = s & 7;
        }
    }

    static void longfunc() {
        int s = 32;
        int e;
        for (long i = 0; i < 1_000_000_000L; i++) {
            e = s & 7;
        }
    }
}

Lock-free replacement for java.util.Hashtable

Watched the below Google tech talk on a lock-free hashtable that uses compare-and-swap (CAS) instructions to achieve consistency when accessed by multiple threads. It also uses probing for collision resolution. What is interesting is how well it scales up when multiple threads are trying to write to it.

Google tech talk
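JavaScript now exposes the same CAS primitive via Atomics.compareExchange on shared integer arrays (this needs a reasonably modern engine; casIncrement is my name for the retry loop, shown single-threaded here):

```javascript
// compareExchange(arr, i, expected, replacement) writes replacement only if
// the slot still holds expected, and returns what was actually there --
// the same contract the lock-free table builds on.
var buf = new Int32Array(new SharedArrayBuffer(4));

function casIncrement(arr, i) {
  var old, seen;
  do {
    old = Atomics.load(arr, i);
    seen = Atomics.compareExchange(arr, i, old, old + 1);
  } while (seen !== old); // another thread won the race: reload and retry
}

casIncrement(buf, 0);
casIncrement(buf, 0);
console.log(buf[0]); // 2
```

In a real multi-threaded setting (workers sharing the buffer) the retry loop is what replaces the mutex: no thread ever blocks, it just re-reads and tries again.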

Wednesday, October 31, 2012

Bilby.js and Functors, Applicatives, and Monads

Research to understand Functors, Applicatives, and Monads

  • Decided I had to learn a real functional language like Scala or Haskell to understand the concepts. 
  • Picked Scala to learn the more advanced functional programming constructs.
  • History: the language Haskell is named after the logician Haskell Curry, as is the technique of currying.

Tuesday, October 30, 2012

DataTables (table plug-in for JQuery)

DataTables is a plug-in for the jQuery Javascript library. It is a highly flexible tool, based upon the foundations of progressive enhancement, which will add advanced interaction controls to any HTML table. Key features:

Here is the link.
It integrates with Twitter bootstrap!


Github example to come on using knockout, upshot, nodejs, and mongo to build a demo spa.

Upshot is a powerful DAL (Data Access Layer) JavaScript library. It knows how to talk to server or local data storage. It knows how to interpret data from the server, understanding the metadata for things like validation and association. It knows how to track changes and synchronize that data back to the server.

In the case of server data storage, upshot can talk to JSON/XML data services like Web API. In the case of local storage, upshot can use HTML5 local storage. There is even the possibility of using local storage when there is no connection to the server, and synchronizing all changes to the server as soon as it is available. This is a great library that makes building a SPA as easy as building a normal web application.

If the bufferChanges attribute is set to true, Html.UpshotContext(bufferChanges: true), Upshot won't synchronize changes with the server until we explicitly ask it to. But if it is set to false, Html.UpshotContext(bufferChanges: false), changes are immediately sent to the server. Buffering is great functionality if we want to do bulk inserts/updates, which is a great enhancement in terms of performance because it reduces the number of HTTP requests to the server.

Upshot can also perform queries against data, like sorting, filtering, etc. Moreover, it can query remote data as well as local data, so you don't have to go remote if you just want to filter over data that is already local.

Knockout video that led me to upshot.

Great tutorial on upshot.js

Counterpart

JSONP with JQuery

"If the URL includes the string "callback=?" (or similar, as defined by the server-side API), the request is treated as JSONP instead. See the discussion of the jsonp data type in $.ajax() for more details." — according to the jQuery documentation.


Request Component

The amplify.request component sets out to make data retrieval more maintainable. It does this by separating the definition of a request from the actual implementation of requesting the data. The definition of the request is only responsible for deciding the caching strategy, server location, data type, decoders, headers, etc. While the actual requestor code only needs to know a request ID. This allows for a data layer which can make your application much easier to test and more agile to change.

Request Define API

The role of amplify.request.define is to allow you to focus on only defining the implementation of retrieving data. The API for amplify.request.define can be very simple, but also has numerous optional parameters that you can leverage, including some that you can provide custom implementations of your own.

Define a Request

  amplify.request.define( string resourceId, string requestType [, hash settings ] )
  • resourceId: Identifier string for the resource
  • requestType: The type of data retrieval method from the server. There is currently one built in requestType (ajax), but you can also define your own.
  • settings (optional): A set of key/value pairs that relate to the server communication technology.
    • Any settings found in jQuery.ajax()
    • cache: Providing different caching algorithms for the request and response
      • boolean: Cache the data in-memory for the remainder of the page load
      • number: Cache the data in-memory for the specified number of milliseconds
      • string: Persistent client-side cache using
        • "persist": Will use the first available storage type as default
        • "localStorage", "sessionStorage", etc. ... 
    • decoder: A way to parse an ajax response before calling the success or error callback. There is currently one built in decoder (jSend), but you can also define your own. By default no decoder is used unless you set it.
      • "jsend": A specification on how JSON responses should be formatted
Example Usage
  amplify.request.define( "getContactDetails", "ajax", {
      //Amplify will replace {id} with data passed to it
      url: "/Contact/Details/{id}",
      dataType: "json",
      type: "GET",
      //Response will be cached for 15 seconds
      cache: 15000
  });

Request API

Once you've defined your request, then the next step would be to go ahead and call the amplify.request method. Thankfully, most of the hard part was defining the definition of the request, so calling amplify.request is fairly straightforward.

Simplified Request

  amplify.request( string resourceId
      [, hash data [, function callback ]] )
  • resourceId: Identifier string for the resource
  • data (optional): an object literal of data to be sent to the resource
  • callback (optional): a function to call once the resource has been retrieved
Example Usage
  amplify.request( "getContactDetails",
      //Amplify will resolve url to "/Contact/Details/4"
      { id: 4 },
      function( data ) {
          console.log( data );
      });

Request with Hash Settings

  amplify.request( hash settings )
  • settings
    • resourceId: Identifier string for the resource
    • data (optional): Data associated with the request
    • success (optional): Function to invoke on success
    • error (optional): Function to invoke on error
Example Usage
  amplify.request({
      resourceId: "getContactDetails",
      //Amplify will resolve url to "/Contact/Details/4"
      data: { id: 4 },
      success: function( data ) {
          console.log( data );
      },
      error: function( message, level ) {
          console.log( level + ": " + message );
      }
  });

Sample Application

Now we will update the sample application from above and swap out the call to $.ajax() to use amplify.request instead.
  var hackerNews = (function( $, undefined ) {
      var pub = {};

      pub.init = function() {
          //...

          //Define getNews AJAX JSONP request in Amplify
          amplify.request.define( "getNews", "ajax", {
              url: "" +
                  "?format=jsonp",
              dataType: "jsonp",
              cache: 30000
          });
      };

      //...

      function getNews( callback ) {
          //Request for news from Amplify
          amplify.request({
              resourceId: "getNews",
              success: function( data ) {
                  if ( callback ) callback( data );
              },
              error: function( message, level ) {
                  console.log( level + ": " + message );
              }
          });
      }

      //...

      return pub;
  }( jQuery ));

  hackerNews.init();
  hackerNews.getAndDisplayNews();
You can view, run, and edit the above source code from jsFiddle.
We first defined what the request will look like and placed that in our init method. We define the request as getNews, use the built-in ajax type, and then pass some options which include common $.ajax() settings such as url and dataType, but also include the amplify.request-specific cache setting, where we cache the response of the request for 30000 milliseconds (30 seconds).
After defining our request with amplify.request.define, we then swapped out the $.ajax() code that was in the getNews method and replaced it with amplify.request, referencing the resourceId we previously defined plus success and error callback methods.
With caching on, when the user clicks the Refresh button there will be a 30-second period where no actual ajax request is made to the server. You can verify this using the web developer tools in your browser. Once the cache has expired after 30 seconds, the real request will proceed to the server and return fresh results that will in turn be cached for another 30 seconds.
By making these two small changes the application still works as it did before and we didn't have to change any other code. The power of Amplify is in having the definition adapt to changes while leaving the request functionality untouched. Just think what might happen if the backend switched to web sockets, or if XML were returned instead of JSON. Towards the end of this article we will take another look at the power of making this abstraction.


  • Separate out the definition of a request from the actual request for data from the server
  • Ability to provide alternative caching and decoding implementations