Have Your Efficiency, and Flexibility Too

Metaprogramming Techniques For No-Compromise Code

by Nick Sabalausky

Full source code for this article is available on GitHub, or can be downloaded here.

View this article on a Single Page or Multiple Pages

Table of Contents:

  1. Have Your Efficiency, and Flexibility Too
  2. First Attempt: Send Efficiency and Flexibility to Dr. Oop's Couples Therapy
  3. Respecting the Classics: Old-School Handcrafting
  4. Success at Dr. Metaprogramming's Clinic
  5. It Walks Like a Duck and Quacks Like a Duck...Kill It!
  6. Metaprogramming Plus: The Flexibility Enhancements
  7. The Last Remaining Elephant In The Room: Runtime Conversion
  8. Curtain Call

Have Your Efficiency, and Flexibility Too

Efficiency and flexibility are programming's classic odd couple. Both are undeniably useful, but they never seem to get along. You try to improve one, and the other just balks and storms off. Prima donna! That might be fine for some scenes, but often you can't get by with only one: It breaks the whole dynamic!

Just look how the efficiency and flexibility knuckleheads bicker in even the simplest situations. This example is written in the D programming language, but it's the same sad old story in any language:

From ex1_original.d:

struct Gizmo { this(int numPorts, bool isSpinnable) { if(numPorts < 1) throw new Exception("A portless Gizmo is useless!"); ports.length = numPorts; _isSpinnable = isSpinnable; } private OutputPort[] ports; @property int numPorts() { return ports.length; } void doStuff() { foreach(port; ports) port.zap(); } private bool _isSpinnable; @property int isSpinnable() { return _isSpinnable; } int spinCount; void spin() { // Attempting to spin a non-spinnable Gizmo is OK. // Like insulting a fishtank, it merely has no effect. if(isSpinnable) spinCount++; // Spinning! Wheeee! } }

Ok, so I guess the fighting between efficiency and flexibility isn't so obvious at a glance. They're making some ordinary-looking Gizmos and nothing seems immediately wrong. There's no fists flying, no colorful primetime-banned language, no headlocks or piledrivers. But you have to look closer: it's passive-aggression. And experts say that's psychologically-damaging, right?

Normally, code like that would be fine, so what's the problem? Well, we don't just have two or three Gizmos collecting dust until a rare special occasion when we decide to pull one out to use it. Oh, no. Gizmos are the main component in the real product: the UltraGiz. The UltraGiz is made of thousands of Gizmos of all different types, and it really gives those Gizmos a big workout. Plus, each port-zapping is fairly quick. The real expense comes from how many port-zaps occur.

So even a small improvement to the Gizmo's size and speed will add up to a big improvement in the UltraGiz. And since many different types of Gizmos are needed, we know we need both efficiency and flexibility.

What flexibility is needed? For one, some Gizmos need to spin, and some don't. But every Gizmo is paying the price for that flexibility: There's always that spinCount variable, even for ones that don't spin. And they all have to take the time to check the isSpinnable variable. Heck, each Gizmo even has to use storage space just to know whether it's spinnable or not. They're not just stamped on the side with "spinny" or "no spinny".

And then there's the output ports. Every time any Gizmo is called upon to do its stuff, it has to check how many ports there are, zap the first one, increment some internal value, check if its done, zap the next one, etc. That's necessary for a few of the Gizmos, but most Gizmos only have one or two ports. Why can't they just zap their one or two ports and be done with it? Most Gizmos have no need for that extra overhead.

Ultimately, the problem boils down to all that flexibility coming from runtime variables. Since the flexibility happens at runtime, the compiler can't usually optimize away the speed issues. And since all the Gizmos are the same type, struct Gizmo, the compiler can't optimize the space issues either.

Let's see how the Gizmos currently perform. I'll use this test program to simulate an UltraGiz and time it:

From ex1_original.d:

struct OutputPort { int numZaps; void zap() { numZaps++; } } struct UltraGiz { Gizmo[] gizmos; int numTimesUsedSpinny; int numTimesUsedTwoPort; private void useGizmo(ref Gizmo gizmo) { gizmo.doStuff(); gizmo.spin(); if(gizmo.isSpinnable) numTimesUsedSpinny++; if(gizmo.numPorts == 2) numTimesUsedTwoPort++; } void run() { StopWatch stopWatch; stopWatch.start(); // Create gizmos gizmos.length = 50_000; foreach(i; 0..10_000) gizmos[i] = Gizmo(1, false); foreach(i; 10_000..20_000) gizmos[i] = Gizmo(1, true ); foreach(i; 20_000..30_000) gizmos[i] = Gizmo(2, false); foreach(i; 30_000..40_000) gizmos[i] = Gizmo(2, true ); foreach(i; 40_000..45_000) gizmos[i] = Gizmo(5, false); foreach(i; 45_000..50_000) gizmos[i] = Gizmo(5, true ); // Use gizmos foreach(i; 0..10_000) foreach(ref gizmo; gizmos) useGizmo(gizmo); writeln(stopWatch.peek.msecs, "ms"); assert(numTimesUsedSpinny == 25_000 * 10_000); assert(numTimesUsedTwoPort == 20_000 * 10_000); } } void main() { UltraGiz ultra; ultra.run(); // Runtime error: A portless Gizmo is useless! //auto g = Gizmo(0, true); }

On my 1.7 Ghz Celeron (Yes, I know that's old, please don't flame me!), compiling with DMD in release mode with optimizations and inlining on, my result is 21 seconds. My system's task manager tells me it used 10.4 MB of RAM. Hmm, even on my old hardware, that could really be better.

Flexibility is really starting to push his co-star's buttons. Efficiency is even getting ready to take a swing at Flexibility. Uh, oh! At this point, many people just decide to prioritize either efficiency or flexibility. Often, this comes with reasons like "Hardware just keeps getting faster" or "This is built for speed, those who need flexibility can use a slower alternative." I've never liked to compromise, and I think we can do better. So let's see if we can diffuse this odd couple's situation before it turns into an all-out brawl (and they consequently loose their lucrative time-slot due to prime-time decency standards).

Next: First Attempt: Send Efficiency and Flexibility to Dr. Oop's Couples Therapy

Table of Contents:

  1. Have Your Efficiency, and Flexibility Too
  2. First Attempt: Send Efficiency and Flexibility to Dr. Oop's Couples Therapy
  3. Respecting the Classics: Old-School Handcrafting
  4. Success at Dr. Metaprogramming's Clinic
  5. It Walks Like a Duck and Quacks Like a Duck...Kill It!
  6. Metaprogramming Plus: The Flexibility Enhancements
  7. The Last Remaining Elephant In The Room: Runtime Conversion
  8. Curtain Call