Monday, January 31, 2011

Attention all Personnel

Update your blogs, folks! I know you've been up to something; what is it? What did you learn from it?

A short observation: I discovered the level of accuracy in my technique at pool vastly improved if I decreased the distance between cue and cue ball. Short stroke trumps long stroke.

Another short observation: Under sail, when preparing for a gybe, be sure you are running directly down wind. A gybe should not involve significant rudder/tiller movements, and one's heading should not change dramatically. In this video, the nose of the boat does not turn during the gybe; the main is simply sheeted in until the wind carries it across.

Strolls through Piedmont Park are best completed with Emma and a smoothie.

Tuesday, January 18, 2011

Taste of AVX

One can use Intel's Software Development Emulator to execute programs utilizing unsupported ISA extensions such as Advanced Vector Extensions. This tool, based on Pin, instruments binaries and emulates instructions not supported in hardware. It also provides detailed instruction opcode histograms so you can, for instance, figure out whether some video game or library was compiled with SSE support.

Here's an example x86-64 function implementing element-wise multiply-accumulate of two single-precision floating-point vectors of arbitrary length.


; void vectorFMA(float *R, float *A, float *B, int N)
;
; R[i] += A[i] * B[i] for i = 0 .. N-1
;
; uses AVX to multiply two vectors, elementwise, and add
; results to a third vector
;
; rdi - R
; rsi - A
; rdx - B
; rcx - N
;
vectorFMA:

.L1:
; load words from R, A, and B
vmovups ymm1, [rdi]
vmovups ymm2, [rsi]
vmovups ymm3, [rdx]

; *R += *A * *B
vmulps ymm4, ymm2, ymm3
vaddps ymm5, ymm1, ymm4
vmovups [rdi], ymm5

; R+=8, A+=8, B+=8
add rdi, 32
add rsi, 32
add rdx, 32

; if (--rcx == 0) break;
dec rcx
jnz .L1
.L5:
ret


When our SandyBridge machine arrives next week, I should have a handful of interesting microbenchmarks to run on it.

Ferroequinology: Running Boards

We hoisted engineer's side running boards onto the brackets mounted the previous week. Now, it is possible to access the top of A&WP 290's boiler shell. We used the opportunity to heat and remove a handful of brackets to clear several sand and steam pipes.

Protip: when applying a socket wrench to a red-hot steel nut, make sure your gloves are dry to prevent any unwanted heat transfer...

Update: water soaked into gloves acts as a thermal conductor resulting in scalds if a gloved hand is applied to steel whose temperature is several hundred degrees above the boiling point of water.

Friday, January 14, 2011

Wanted

My notebook. Last seen December 6, 2010.

Constant-time Set Membership

While scouring the internet yesterday, I came upon an article written by Preston Briggs, currently at NVIDIA and a former office mate in Bellevue. Efficient representation for sparse sets.

This paper considers set representations of objects for which a bijection to the natural numbers exists and provides a novel alternative to bit vectors.

Sets represented by bit vectors are one of the more common approaches [what we use in Ocelot]. Given a universe of size u, bit vectors indicate whether element i is a member of a set by setting bit i providing access times on the order O(u). This representation also requires O(u) time to clear the bit vector initially. If short-lived sets with large u are needed frequently, overheads can be prohibitive. A sparse representation may be desirable, but it should have O(1) access times.

Preston's approach represents a sparse set with two arrays of size u, denoted dense and sparse, and an integer k representing the number of elements in the set. The array dense stores a packed sequence of elements that have been added to the set, and valid indices are 0 ... k-1. The array sparse provides a mapping of natural numbers to possible indices into dense. For example, assume j belongs to the set and is stored in location dense[p]; the following is true: sparse[j] = p.

With two array access, both O(1) we can efficiently determine whether element j is in the set:

def isIn(j, S):
return (S.sparse[j] < S.k) and (S.dense[S.sparse[j]] == j)

That is, if sparse[j] is a valid index (from 0 to k-1) and the object at that index is indeed the element, then it belongs to the set. This technique has the advantage that neither dense nor sparse needs to be cleared initially. To clear the set, simply set k = 0.

Elements are added by:

def insert(S, j):
S.dense[S.k] = j
S.sparse[j] = S.k
S.k += 1

Storage requirements are 2*(u+1) for a set of u integers. I thought this was interesting.

Tuesday, January 11, 2011

Delegates in C++

I reimplemented the Baker valve gear simulator in C++ a few weeks ago. This involved adding a nonlinear equation solver to libkerr's matrix library, my custom-made high-performance linear algebra library that's been reused over the years in dozens of small projects and a few large ones.

The algorithm itself is nothing special: Newton's method. Implementing it in a general and yet efficient manner in C++, however, presents a design choice.

I was tempted to accept a template argument and call that class's operator() member function.


template < typename Real, typename Functor >
int fsolve(matrix &x, const Functor &F, Real precision, int maxIterations) {
..
while () {
matrix fx = F(x); // operator()
}
..
}


This, however, would require a separate class for each constraint function. In my case of coupled systems, I need to determine roots for two distinct constraint functions that shared both constant state and analytically computed state. Additionally, the zeros of the first function would be needed as input to the second function.

Instead, I opted for the delegate design pattern. The functor includes both the object as well as references to member functions.


template <typename Real, typename Functor>
int fsolve(matrix<Real> &x, const Functor &F, matrix<Real> (Functor::*ptr)(matrix<Real>), Real precision, int maxIterations) {
..
while () {
matrix<Real> fx = (F.*ptr)(x); // delegate
}
..
}
..
BakerValve instance;
fsolve<double, BakerValve>(x1, instance, &BakerValve::constraintF1);
..
fsolve<double, BakerValve>(x2, instance, &BakerValve::constraintF2);


Some additional features of my solver are the Jacobian can either be estimated by a default solver or computed analytically by a user-supplied function.

The implementation seems to work well and always converge. I haven't implemented an aesthetically pleasing visualization of its output, as this was mostly scratching an itch for implementing the constraint solver.

Sunday, January 9, 2011

Ferroequinology: clearing A&WP 290's Boiler Shell

Yesterday, I visited the Southeastern Railway Museum and worked on A&WP 290 with Dale Grice.

In continuing with the goal of removing everything from the boiler shell to prepare for an eventual blast with baking soda, we removed a steam pipe carrying a "signal" that actuates cylinder cocks. A valve in the engineer's side of the cab raises steam pressure in this pipe which actuates valves at the bottoms of the main cylinders at the front of the locomotive that permit condensation and steam to blow out when the locomotive starts. This is why steam engines seem to start in a cloud of steam.

A&WP 290 venting cylinder cocks

This required removing two unions coupling a network of steel pipes. I tried without luck during a previous visit to wrench them apart. Yesterday, we used a "rosebud" oxy-acetylene torch to heat the union coupling to dull cherry red then quenched it with a cup of water. The first coupling I tried this on came loose readily and after one revolution with the wrench [and plenty of time to cool], I removed it the rest of the way with torque applied by [gloved] hand.

The second union would not budge in spite of this treatment. Between heatings, I hit it with a hammer but it remained fast. Consequently, we used a cutting torch which emits a jet of oxygen gas through a flame and oxidizes steel. A short segment of pipe was cut between the union and a T-coupling thus freeing the main pipe along the boiler.

With unions apart, the cylinder cock pipe would come loose with the removal of a single bracket fastened to the boiler shell via a threaded stud. Unfortunately, the nut holding the bracket was corroded and could not be made to turn without heat. After heating to cherry red, a wrench with torque magnified by a breaker bar [steel pipe] eventually made some progress. Or we thought it was progress. After about half a revolution the nut fell off revealing we had literally torqued the 1/2" stud apart and never moved the nut. One more thing to fix on the boiler...

I manned the portable crane and we pulled the pipes off the boiler then applied a set of labels so they can be replaced or new copies made.

To reach the top of the boiler, the next step was to replace air cylinder brackets. These are large castings [weighing probably 50 or 60 pounds each] that suspend two large air cylinders along the locomotive, one on each side, and support running boards.

A&WP 290 in New Georgia Railroad's Pullman Yard

Again, I manned the crane lifting the castings up to the boiler while Dale aligned them with 1.5" diameter threaded studs. These studs were placed in the boiler from within when it was built and cannot be easily replaced without potentially compromising the boiler's integrity [not to mention incurring a laborious and detailed inspection]. The first bracket went on without much trouble but we discovered it was mis-labeled and didn't sit flush with the boiler shell. The next bracket didn't slide onto the studs at all, and we discovered the reason was due to the forward stud bent up and forward. The studs needed to be parallel for the bracket to slide on, and this clearly wouldn't do.

To straighten it, Dale fitted a large 12' steel pipe over the stud, and I climbed up on a scaffold. The plan was for me to bend the stud using the pipe as a lever. Doing this proved problematic; the pipe was not adequately stiff, and all force I applied deformed the pipe like a spring. I had visions of losing my grip and the pipe springing back; the last thing I would see is a 2.5" steel pipe swinging back to hit me in the face...

We abandoned this idea pretty quickly. When Dale attached a nut to the end of the stud and hit it with a sledgehammer without any success, he decided heat was the only solution. Again, the rosebud torch came out and in no time, the base of the stud was glowing cherry red. Each gentle stroke of the sledgehammer moved the end of it by about 1/4", and in no time we had straightened it.

Both brackets slid on without difficulty after this, and now we'll be able to add running boards and access the top of the boiler. Progress!

Wednesday, January 5, 2011

My Struggles in Trigonometry Class

During a period of restlessness last night, while lying in bed I recalled a project I completed during 11th grade trigonometry class in high school. Today, I don't remember the minimum scope requirements, but I chose to satisfy them by building a game that simulated artillery. The game environment consisted of a simulated island (a 3D heightfield) and two textured stone bunkers corresponding to each player's location.

In another window with GUI controls, the player submitted elevation, azimuth, and velocity of an artillery shell. Then, they would click FIRE and a shell would launch from their location, arc over the simulated island, and impact somewhere. If it impacted the other player's bunker (or their own), the player who's bunker survived received a point and the game restarted with new random locations. The camera was attached to the shell so the firing player would have a chance to observe the island, reconnoiter their opponents location, and update their trajectory. Differences in elevation between the two players' bunkers and physical obstructions such as the island itself meant each player needed to determine a unique trajectory to the other. I believe there was also simulated wind.

I spent plenty of time working on it and made it finally ready to present the day all projects were due. No one else did anything related to computing, let alone implement a real-time 3D renderer with a game built around it. The morning before class, I installed it on the antiquated classroom PC to test it out. To my horror, the low max-resolution of the video card prevented the GUI controls from appearing; the input window was obscured and not sizable. The controls for entering artillery parameters and firing the shell were out of view. Our teacher was perpetually suspicious of technology and told me I would have to submit it as is; no deadlines would be granted. I think she may have even been slightly pleased that smarty-pants Andykerr would finally be hoist with his own petard (when taken literally, 'sent into the air by one's own powder keg' -- a remarkably delightful metaphor given the context of my game).

I had until sixth period, classtime, to resolve the issue and make the game playable. Unfortunately, I did not have the source code, and even if I did, the machines I had access to with Visual Studio lacked the DirectX software development kit. Recompiling it was impossible.

So I edited the binary instead.

Windows applications store layouts of controls as resource files embedded in the executable. These are binary also, but I knew enough to expect the X and Y coordinates of the critical controls to be stored as 16-bit unsigned integers. Using MSPaint, I took a screenshot of the application to determine what the locations of the buttons and textboxes actually were. I visited the electronics lab where computers with Visual Studio were installed, and I used Visual Studio's hex editor to search and edit the executable. Fortunately for me, the coordinates made unique binary sequences for that particular application, so finding them was straightforward. I made up new values, overwrote the old ones, and hoped.

I brought it Trigonometry during sixth period, installed it on the classroom PC, and was met with pleased astonishment. The controls were in new positions, visible, and the game could be played as designed. The textures were crappy, and the slow video card made gameplay uninteresting, but it ran and I received full credit.

3D games rely heavily on trigonometric relationships to transform and project mathematical representations of a virtual world onto a raster display, but the math was by *far* not the most difficult part of that project. I don't think anyone else appreciated what I went through for that damn grade...