Saturday, February 28, 2009

Turing Test Take Two

One concept that Alan Turing is famous for is his test for evaluating artificial intelligence, known as the Turing Test. In the test, people attempt to identify, by means of a conversation, which "far end" conversationalists are people and which are computers.

I would propose an alternate version of the Turing test: the artificial intelligence side runs the source code through a tool and an engineer (fairly literally) implements the tool's recommended changes. The "control group" would be one or more experienced engineers who do a peer review of the same code and implement their improvements. If an observer cannot tell which code "improvements" were the result of the machine and which were the result of good engineering judgment, the tool has passed the Turing Test Take Two.

Conversely, if a tool cannot pass the Turing test, we must use an experienced engineer to filter the "recommendations" that the tool makes before we apply them.


Case study


In our contracts and in our work instructions, we have implicitly made our tools the "gatekeeper" and final judge of our code quality. The way we fall into this trap is that in our contracts or Plan for Software Aspects of Certification (PSAC) we specify that we will provide the artifacts generated by our tools to prove that our code is "good." The result is that the goal of running the tool is no longer to produce good code, but rather to produce clean printouts ("no faults found") for the customer.

The way back from this madness is to make engineers responsible for the code they write and the code they review. They should use tools to help them write good code and perform quality reviews, but the artifact that we take to the customer should not be "PCLint signed off on this code written by an anonymous cog" but "Gerald Van Baren wrote this code and is proud of it" and "Joe Competent Engineer reviewed this code and agrees that it is good." In other words, our engineers must taste the sausage. (In that article, map leaders => experienced engineers (aka. gatekeepers), sausage => code, broken machines that result in overtime => broken or misapplied tools that result in overtime.)

Our C Coding Standard (an internal standard consisting mostly of MISRA-C rules) is a classic example of a tool gone wrong. We sowed the seeds of a Turing Test Take Two breakdown in Rule 4 (of the internal standard): "Source code shall be run through an acceptable static source code analyzer." When we write in our PSAC that we will follow our C Coding Standard, we have just jumped the shark, and we never saw it coming. While Rule 4 does not explicitly state that the engineer must implement the tool's "recommendations,"1 in practice it is easier to make the tool shut up than it is to explain and defend and defend and defend good engineering judgment that is contrary to the tool's "recommendation."
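
To make that path of least resistance concrete, here is a hypothetical sketch (my own code, not taken from any project mentioned above), assuming a PCLint-style analyzer that warns when a function's return value is silently discarded (PCLint's message 534 is one such warning):

    #include <stdio.h>

    /*
     * Hypothetical sketch: the analyzer warns that printf()'s return value is
     * being ignored.  The (void) cast below makes that warning go away with
     * one keystroke.  Whether the cast records a considered judgment
     * ("printf()'s return value is not actionable here") or merely makes the
     * tool shut up is invisible in the clean "no faults found" printout that
     * goes to the customer.
     */
    int main(void)
    {
        (void)printf("system initialized\n");
        return 0;
    }

The cast costs six characters; writing up and defending the equivalent engineering-judgment waiver costs a paragraph per review cycle, which is exactly why the tool tends to win.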

The case study in creating an "acceptable static source code analyzer" is the (first try internal C Coding Standard) checking tool. We spent (lots of money) contracting (elided) to implement it and then discarded it because it was hopelessly inadequate. We followed that by spending (a tenth as much money) on the (second try internal C Coding Standard) tool, which was only moderately inadequate. We are now mainly using PCLint (thousands of dollars per seat), which is almost adequate but is still incapable of passing the Turing Test Take Two.

We actually (inadvertently) ran the Turing Test Take Two on the (first try internal C Coding Standard) and (second try internal C Coding Standard) tools: we assigned engineers, "sea of hands" fashion, to implement changes to project source code based on the results of running the static analysis tool on that code. That was a disaster. Management quickly realized from the howls of anguish of the affected internal engineers that it wasn't working and backed off that approach.



  1. When I discussed the (first try internal C Coding Standard) tool with an experienced, highly regarded engineer, he told me he ran the (first try internal C Coding Standard) tool on his code because the PSAC said he had to. He noted that the PSAC had a [X] checkbox for running the checking tool, but did not have a checkbox saying that the results were used for anything, so he did an incredibly practical thing: he simply discarded the verbosely bogus results. He then ran PCLint on his code, using it as a tool (not a judge), to identify problem areas in his code and applied his engineering judgment to determine which complaints were real and which were artificial (a hypothetical sketch of that triage follows below).
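
As a sketch of that triage (hypothetical code, not drawn from his project), the difference between an "artificial" complaint and a real one might look like this:

    #include <stdint.h>
    #include <stdio.h>

    /* An "artificial" complaint: uint8_t arithmetic is promoted to int by the
     * language, so strict implicit-conversion rules (the MISRA-C conversion
     * family) tend to flag the unadorned expression even though the result
     * obviously fits in a uint8_t.  The cast exists only to quiet the tool. */
    static uint8_t next_channel(uint8_t channel)
    {
        return (uint8_t)((channel + 1u) % 16u);
    }

    /* A real complaint is the opposite case: the tool flags something like an
     * '=' typed where '==' was intended, and the right response is to fix the
     * code, not the report.  Telling the two apart is the engineering judgment
     * that no checkbox records. */
    int main(void)
    {
        (void)printf("next channel: %u\n", (unsigned int)next_channel(15u));
        return 0;
    }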

Intellectual. Property.

Or, as our British friends would say, "intellectual (full stop) property." Management has subscribed to the term "intellectual property" to allow them to indulge in their fantasy that the company's "intellectual property" can be transferred readily to a "low cost labor region." If the "intellectual" part is embodied in employees, it is not very portable, which really balls up their plans to save money by transferring their "intellectual property" to "low cost labor regions."

I contend that intelligence is real and it does create property for companies, but the property (code, documentation, manufacturing, etc.) is an artifact of intelligence. It isn't intelligence itself and thus "intellectual property" is a jarring discord. Toner on paper does not have any ability to be intelligent. Bits on a disk do not have any ability to be intelligent.

The property half is held inside the company walls, but the intelligence half walks out the door every evening.

Management can talk all they want about the company's "intellectual property", but disks full of bits have only a residual value without the human intelligence that understands what those bits mean.

As an aside, if the intelligence half did not walk in the company door some morning, that would not lessen the value of the property the company actually owns. Management is inappropriately taking credit for the intelligence that humans bring to the business.

Transferring "intellectual property" to "low cost labor regions" is an extremely short term and an extremely destructive strategy. The fundamental reason some areas are "low cost" is because the workforce there often is inexperienced in general and always is inexperienced in the problem domain of the "intellectual property." By the time the "low cost labor region" workforce has gained the domain knowledge to effectively use the "property" part (the bits on the hard drives), they will no longer be low cost. Aggravating this, there is a substantial risk that the intrinsic value of the the bits on the hard drives will have decayed to a level that no amount of added intelligence can restore the value back to its original level.

As a concrete example, how much better are the "big three" domestic automakers doing after transferring their "intellectual property" (knowledge of how to build automobiles) to "low cost labor regions"? I contend it nearly killed them. Maybe it has killed them, and what we now see are their death throes.

In an article on IBM layoffs, Robert E. Kennedy, a professor at the University of Michigan's Ross School of Business, states: "GM is stuck with high-cost, medium-skilled engineers. That's one reason it takes GM seven years to go from concept to design to the showroom floor [to produce a new car model], whereas it takes Toyota only three years. If they were more into tapping into the best talent wherever it is in the world, GM would be in better shape today."

This is a completely specious argument. Toyota hasn't moved their engineering to "low cost labor regions." Instead, they have worked to make their engineers higher skilled and more efficient. Dr. Kennedy's quote cuts exactly opposite to what he intended to show: it states that offshoring is killing GM. The engineers that remain are the second-rate ones... by implication, the best ones have all left.

My Google searches indicate that Toyota created new technology centers in...
  • Ann Arbor, MI - Detroit's back yard. Not a "low cost labor region." If there truly are no good engineers in Detroit, it would follow that they wouldn't be in Ann Arbor either.
  • Cannes, France - in one of the most expensive countries in the EU.
  • Australia - I don't know how this compares, but it definitely isn't a "low cost labor region" for its part of the world.

Wednesday, February 25, 2009

First post

When it is your own blog, it is easy to score "first post!"