The Embedded Figures Test page provides a test that I hope will be a good predictor of other, less repeatable or automatable tests of cognitive flexibility. It invites readers to take the test and fill in a short questionnaire. There are now 226 valid entries in the database, 26 of which are from readers who have done the test twice, with some stress-reducing exercises in between. This post examines the results to date. You can download the raw data and analysis program at the end of the post.



There are 226 valid entries out of 252. Valid entries have a Before score > 100ms. 26 of them provide Before and After values. At least 152 are geeks, as detected by looking for some geekly substrings in the Occupation field. There are quite a few blank Occupations, so the actual number of geeks could be even bigger. The respondents are nearly all male, so this is still a sample biased towards male geeks.
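To make the filtering concrete, here is a minimal sketch of the validity check and the geek detection. It is not lifted from EftAnalyzer.cpp: the `Entry` layout, the function names and the particular substrings in `hints` are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical record layout; the real data file may be organized differently.
struct Entry {
    double beforeMs;         // Before score in milliseconds
    std::string occupation;  // free-text Occupation field, may be blank
};

// An entry is valid when its Before score is greater than 100ms.
bool isValid(const Entry& e) {
    return e.beforeMs > 100.0;
}

// Lower-case copy, so that substring matching ignores case.
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// True when the occupation contains any of the illustrative "geekly" substrings.
// The list below is an assumption, not the list used for the figures in this post.
bool looksGeeky(const Entry& e) {
    static const std::vector<std::string> hints = {
        "program", "software", "engineer", "developer"};
    const std::string occ = toLower(e.occupation);
    return std::any_of(hints.begin(), hints.end(), [&occ](const std::string& h) {
        return occ.find(h) != std::string::npos;
    });
}
```

A blank Occupation never matches, which is why the count of geeks found this way can only be a lower bound.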

First a couple of things that haven’t worked out yet, then the more interesting stuff. Here’s the distribution of respondents across 1000ms bands:



It looks like a (pointy) Poisson distribution. Maybe a stingray :-) There is no evidence of the “double hump” I’d hoped to see, which might directly reveal the presence of two distinct strategies. However, the geekly bias of the sample might mean that it doesn’t contain the second hump yet.
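Counting respondents per 1000ms band is a simple histogram. A minimal sketch follows, assuming the bands start at zero (band 0 covers 0 to 999ms, band 1 covers 1000 to 1999ms, and so on); the function name `bandCounts` is made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Count scores per band. With the default width, band 0 holds [0, 1000),
// band 1 holds [1000, 2000), and so on. Negative scores are skipped.
std::vector<int> bandCounts(const std::vector<double>& scoresMs,
                            double bandWidthMs = 1000.0) {
    std::vector<int> counts;
    for (double s : scoresMs) {
        if (s < 0.0) continue;
        const std::size_t band = static_cast<std::size_t>(s / bandWidthMs);
        if (band >= counts.size()) counts.resize(band + 1, 0);  // grow on demand
        ++counts[band];
    }
    return counts;
}
```

A second hump, if one ever appears, would show up as a second local maximum in the returned counts.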

Next, feeling nauseous when bored. Talking to many programmers over the years, I’ve got the impression that there’s a link between naturally gifted programmers and feeling nauseous when bored, such as in long, tedious meetings. So I added a question to distinguish the people who do feel nauseous (Nauseators) from those who don’t (Non-Nauseators).

Here are the two separate frequency distributions, to the same scale. There is no evidence of any correlation between feeling nauseous when bored and EFT score. If anything, it’s the Non-Nauseators who are more clustered into the 2 and 3 second bands:





Next, age. Advancing age has a clear influence on score. This graph plots the average score and the standard deviation in each decade-long age band:



I think this matches what we know about cognitive tests in general. If it is going to be a useful workplace tool though, for example for qualifying Path. Lab. technicians before they perform tasks requiring good pattern recognition, we need to know if the evaluation should be age adjusted - or if the deteriorating scores do indicate an absolute reduction in pattern recognition ability. Perhaps there are some jobs best done by young eyes!
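The per-decade figures in the age graph are just a grouped mean and standard deviation. Here is a minimal sketch, assuming ages in whole years and the population form of the standard deviation (dividing by n); `BandStats` and `statsByDecade` are names invented for this example.

```cpp
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct BandStats {
    int n;          // number of respondents in the band
    double mean;    // mean score in ms
    double stddev;  // population standard deviation in ms
};

// Group (age, score) pairs into decade bands keyed by 20, 30, 40, ...
// and compute the mean and standard deviation of the scores in each band.
std::map<int, BandStats> statsByDecade(
        const std::vector<std::pair<int, double>>& ageAndScore) {
    std::map<int, std::vector<double>> groups;
    for (const auto& p : ageAndScore) {
        groups[(p.first / 10) * 10].push_back(p.second);
    }
    std::map<int, BandStats> out;
    for (const auto& g : groups) {
        const std::vector<double>& v = g.second;
        double sum = 0.0;
        for (double x : v) sum += x;
        const double mean = sum / static_cast<double>(v.size());
        double sq = 0.0;
        for (double x : v) sq += (x - mean) * (x - mean);
        const double sd = std::sqrt(sq / static_cast<double>(v.size()));
        out[g.first] = BandStats{static_cast<int>(v.size()), mean, sd};
    }
    return out;
}
```

An age-adjusted evaluation could then compare a person's score with the mean and spread of their own band rather than with the whole sample.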

Now for the relationship between external stressors and score. I took each question, and scored it -2 for “Strongly Disagree”, -1 for “Disagree”, 1 for “Agree” and 2 for “Strongly Agree”. Then I compensated for negatively worded questions by multiplying their scores by -1. There are only 7 questions, each worth between -2 and 2, so each respondent can score between -14 and 14 of these normalized “Chill Points”. Mean and standard deviation for each chill band:



It does look like (if anything) there’s a tilt, top left to bottom right. It certainly isn’t tilted the other way. And as the respondents become chiller, the spread tends to narrow. Also interesting is that the effects of stressors are known to be greater when the subjects feel that they are not in control. The spread seems to widen quite suddenly as soon as the respondents perceive themselves as having nett negative chill. (I worded some of the questions negatively to stop people automatically clicking a happy or sad column, so this pattern is, I think, authentically emergent.)
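For anyone wanting to reproduce the Chill Points, the scoring rule can be sketched as follows. The `Answer` struct and the function names are assumptions for illustration, and which questions count as negatively worded has to be supplied by whoever encodes the questionnaire.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// One answered question. Both field names are hypothetical.
struct Answer {
    std::string response;   // one of the four labels handled below
    bool negativelyWorded;  // true when the question is negatively worded
};

// Map a response label to its raw score; there is no neutral option.
int likertValue(const std::string& r) {
    if (r == "Strongly Disagree") return -2;
    if (r == "Disagree") return -1;
    if (r == "Agree") return 1;
    if (r == "Strongly Agree") return 2;
    throw std::invalid_argument("unknown response: " + r);
}

// Sum the raw scores, flipping the sign of negatively worded questions.
// With 7 questions the result lies between -14 and 14.
int chillPoints(const std::vector<Answer>& answers) {
    int total = 0;
    for (const Answer& a : answers) {
        const int v = likertValue(a.response);
        total += a.negativelyWorded ? -v : v;
    }
    return total;
}
```

Because no answer scores zero and there is an odd number of questions, the total for a full set of 7 answers is never exactly zero, so every respondent lands on one side or the other of the nett chill line.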

Of course, we’re talking about gross external stressors here, rather than the fine grain establishment of positive self-confidence that I argue makes a big difference. But this graph is certainly enough to keep me interested in the EFT, particularly since the questions are a very crude probe of personal stress levels, and the conditions for the test are quite uncontrolled. There is no standardization of display size or cleanliness, mouse type, use of a mouse mat, lighting conditions, time of day, practice runs and so on. Better control of such factors might sharpen this pattern.

Also interesting, here’s a similar graph that only shows the entries with Before and After scores, organized by the number of destressing exercises the person did between the tests. The mean and standard deviation of the Before scores are shown in red, with the related After values next to them:



Curiously, even doing zero exercises improves the respondents’ scores, while doing 3 or more even benefits the Before scores :-) I suspect that what we are looking at here is an effect of time spent pondering the reasoning in the Introduction, or it might just be that some people misunderstood the test instructions. If they did the test twice, one straight after the other, we’d expect them to always do better the second time. If they then filled in the activities they normally engage in, the reduction of red and green scores with more stress reducers in play is interesting. Stressors make scores bigger, Destressors make them smaller.

You can download the raw data in file eft27jan2008.txt. The source code for the graph drawing program is EftAnalyzer.cpp. The program uses the wxWidgets graphical toolkit, which you should be able to download and build out of the box using any common OS and C++ compiler. The easiest way to get the analysis program building is to build the minimal sample, then cut and paste the source into minimal.cpp. It’s all in one file to facilitate this, and the existing minimal sample project files for the various compilers and IDEs will work just fine.