Nibbles, Bits, and Bytes

Monday, November 16, 2009

Branch-and-Bound, Graphical Models, Structured Grids

It never fails that Murphy will make an appearance right when I am getting close to wrapping things up. We have bandwidth limitations on internet usage in Australia. Do you remember using dial-up modems? That's about what it feels like when you use all of your allocated bandwidth for the month. Unfortunately, it made it nearly impossible to access the UIUC wiki page to get the papers for the final reading assignment. Here is my lackluster, yet final, submission for this reading blog...

Branch-and-Bound

The branch & bound pattern is fairly well written. Although the use of the data flow diagram did assist in explaining the pattern, it is unfortunate there were no actual code examples to demonstrate the branch and bound operations of branching, evaluation, bounding, and pruning. It was beneficial we had previously discussed the task parallelism, data parallelism, shared queue, and data distribution as they were used to demonstrate the parallelization approaches. I do appreciate the authors not getting to deep into the noise of the technical aspects and kept the pattern fairly high level for a well-rounded audience.

Graphical Models

Does this pattern have anything to do with parallelism? Ok, I understand the idea of modeling the probability distribution, but I do comprehend how this is applied to everyday use. What is the true benefit here? The example, if you call it an example, provided no real insight into what the pattern was about. I wanted to read and comment on the pattern before listening to the recorded discussion. Now, I am interested to hear the follow-on discussion after reading my classmates' comments.

Structured Grid Computational Pattern

This pattern was co-authored by an UIUC undergrad alum who went on to UC-Berkely to complete her PhD. The idea of representing a multi-dimension problem in the form of structured grid implemented as an n-dimensional array is very interesting. I do not have any background or experience in this area. As the array is broken into chunks for parallel processing, the tasks cannot be too fine-grained resulting in additional overhead and reducing efficiency of parallelism. The Geometric Decomposition pattern is handy in this situation.

I am interested in looking at additional information related to "Ghost" nodes, double-buffering, and adaptive mesh refinement.

PRES: Probabilistic Replay with Execution Sketching on Multiprocessors

PRES is a technique or system used for detecting, diagnosing, and reproducing a concurrency bug on multi-processors. I thought it was an interesting reading and was able to recognize similarities and differences between PRES and one our previous readings, CHESS.

In an ideal world, a programmer would love to reproduce a concurrency bug in a single replay and with no runtime overhead. PRES combines three ideas or techniques: (1) record only partial information to keep the replay close to the production run, (2) diagnose, process, and reconstruct unrecorded information to reproduce the bug, and (3) use feedback from unsuccessful replay runs to determine which paths do or do not produce the bug in question.

The various schemes of PRES are quite intriguing. I am not sure how thorough PRES is compared to CHESS. Even though CHESS is Microsoft-based, I would be interested in see a side-by-side comparison with identifying concurrency issues. Each system may be well on their way in building their own niche in the market. Considering the usefulness, efficiency, and feedback received from the concurrency debugging applications, I would definitely want to revisit this topic in a few years to see which approach reigns supreme.

Thursday, November 12, 2009

Shared Queue, Speculation, Circuits

Shared Queue Pattern

The shared queue pattern is found quite often in software applications involving the processing of multiple tasks. A centralized share queue is definitely one of the easier implementations. A common use of shared queue is with a thread pool. One concept is to allow the individual threads pull tasks directly from the queue. Another option is for a master thread to take responsibility for distributing the tasks to the threads for load balancing.

Although the examples in the paper used separate locks for put and take, I believe it is more common to use a single lock for the entire queue. Two locks might be overkill depending on the number threads working off the queue and the max size of the queue itself.

Speculation

I do wish the HTML lexer example was earlier in the paper. After getting through most of the paper, I recalled learning about speculation with branch prediction in a previous class. With branch prediction one path is speculatively chosen based on the past behavior of the program. This can be extended to speculatively execute both branches and terminate one of them when the result of the branch predicate is available.

I do want to go back and review this pattern again after the class discussion. I suspect there will be informative information discussed to help me gain a better understanding of this pattern.

Circuits Pattern

I can honestly say I had no idea what the Circuits Pattern entailed when reading the title. Even though this is not a Physics class, I thought it might have something to do with electronic circuits.

It is unfortunate I have not been professionally exposed to bit shifting as I might have had a more genuine appreciation for the examples of limited information presented in this paper. The md5 example was familiar and somewhat intriguing to see how it is really implemented. Looks like this might be another pattern I will revisit after the class discussion.

Wednesday, November 4, 2009

Armstrong Thesis (Chapter 6)

Chapter six of Joe Armstrong's thesis demonstrates the implementation of supervisors in the OTP system.

Believe it or not, hierarchies are used once again. I am not sure if it is an inferred concept of using Erlang or if it is a design decision by the developer. If you think about it, hierarchies are everywhere. There is always someone higher up the management ladder as you sit in your cubicle staring at the TPS cover sheets (eg. supervisors manage workers).

As with any language where I have not had a lot of hands-on use, I am still not very comfortable with the Erlang syntax. I am sure there are benefits of using Erlang that have not been expressed in the thesis. I think it is unfortunate the class will not be reading a few more chapters or appendix of the thesis. I do not have any real direct interest in learning about programming in Erlang, but the theories behind design choices can be applied to other system designs.

When looking at functionality, the supervisor and worker concept is similar to a thread manager being the parent of spawned child threads. The manager just sits in his office while the real workers do all the labor and satisfy any incomplete tasks. The child threads report back to the manager with status and the manager can re-spawn child threads if a thread happens to die.

Being one of the few thesis I have actually read voluntarily or involuntarily, it was an experience that may drive me to read more technical papers in order to stay knowledgeable on theories of design and development.

Pipeline, Geometric Decomposition, Data Parallelism

Pipeline is probably one of the more popular patterns used by a wide range of people most likely not recognizing they are actually using a synchronous pattern. One of the basic common methods of pipeline is in Unix scripting when users are "grepping" for certain characters in a file or filtering processes running on a server.

In addition to shell scripting, I have experience using pipelines in vector processing and signal processing. Various programs have used a message-passing environment if timeliness was a requirement. Otherwise, network file systems were very common as the concept was more dependable in case a stage of the pipeline failed.

The geometric decomposition pattern involves decomposing a data structure into chunks and processing the chunks in parallel. Designs for this apttern fall into four key elements: data decomposition, data exchange, update operation, and data distribution and task schedule. I would be interested in seeing more of this pattern to get a better understanding of how it can be applied to benefit my work.

Regarding the data parallelism pattern, I would tell you more if there was actual content worthy of commenting on. This pattern is very much in the early stages.

Tuesday, November 3, 2009

Task Parallelism, Recursive Splitting, Discrete Event

I thought had an inkling of an idea of what the Task Parallelism pattern. Boy, was I wrong. After reading the context a couple of times, it was the forces section that helped shed some light on what was actually being presented. Unfortunately, the examples of Monte Carlo simulations on a cluster of computers did not help solidify the pattern being presented. Perhaps a more concrete code example would have been more beneficial. I will be interested in hearing the discussion on this topic.

The Recursive Splitting pattern felt a bit like revisiting the fork/join pattern with a little twist. I am not sure how many other patterns take load balancing into account when determining the number of recursive splits to use. Too many or too few splits are of little benefit to the overall speed of the algorithm.

The Discrete Event pattern presents a method for trying to provide some form of order when dealing with independent tasks. Pessimistic approaches ensure the events are always executed in order at the expense of increased latency and communication overhead. I agree with the author in believing a deadlock timeout is more successful than deadlock detection. I also prefer message passing as a means for communication in systems.

Monday, November 2, 2009

Armstrong Thesis (Chapter 5)

This chapter highlights fault-tolerant systems and how to achieve such success. It is difficult to be able to identify every possible circumstance the system will experience and how to handle the various circumstances.

The philosophy behind fault-tolerant systems is "if we cannot do what we want to do, then try to do something simpler." The thought involves creating a hierarchy of tasks being overseen by supervisors. If the highest level task cannot be performed by a worker process, then the supervisor process initiates a an error recovery procedure and will likely move down the ladder to a lower level task. The system will fail if you get to the ground with the simplest task of all.

The concept of AND and OR nodes reminds me of logic instituted on one of my past development programs. There was a parent or system executive process which monitored all other children processes. If the child process exited with a zero, then it was shutdown on purpose by a user. If the child process exited a value less than zero, the system executive had the ability to restart the child process if the configuration file deemed that action necessary. The concept came in handy when a process would get in a certain state and create a core file. The process is restarted with little interruption and developers could review the stack trace and core file for specific evidence.

I have not been exposed to the term "well-behaved function". Armstrong defines the WBF as a function which should not normally generate an exception. If an exception occurs which cannot be handled and corrected, the function should terminate with an exit statement. Notice the previous statement does not say the application should exit, just the function. Based on the four rules presented for WBFs, I must not fully understand how this approach is different than any other thoroughly thought out function in C++ or Java with try/catch blocks and throwing exceptions.