Customers often ask us what we recommend in terms of computer specifications for running our software.

Before answering, it’s worth noting a couple of key points:

  1. Based on independent performance comparisons, our software should outperform the competition on equivalent hardware. Put another way, you could use cheaper hardware with our software and still get better performance than the competition.

  2. Compared to staff time and software costs, PCs are cheap. A brand new PC only has to increase an employee’s productivity by a few percent compared to their current PC to justify its purchase price. Therefore saving a few hundred dollars on computer equipment may cost more money in the long run due to lower productivity.

To summarise: our software will be the fastest on any particular hardware, but it may still be possible to justify buying more expensive hardware with a simple cost-benefit analysis, taking into account the increased productivity that the better hardware will allow.

So, what components make the most difference to the user’s productivity? In this first article I’ll focus on the most important component: the CPU. Later articles will look at the graphics card, the amount of memory, and the value of 64-bit operating systems.

CPU Speed

Automatic relative-only point generation, the resection, the bundle adjustment, and DTM generation all depend on CPU speed. (If you’ve only ever done small projects then you may not think of the resection or bundle adjustment as being a big issue, but when you have a thousand images in a project they can both take a significant amount of time.)

All of these will also get faster, to varying degrees, by using more CPU cores, although the overall gain tends to diminish as more cores are added (i.e. four cores are not twice as fast as two cores, and two cores are not twice as fast as one core) thanks to something called “Amdahl’s Law”.
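Amdahl’s Law itself is simple enough to sketch in a few lines. This is a generic illustration, not a model of our software; the 90% parallel fraction is an assumed figure for the example:

```python
# Amdahl's Law: if a fraction p of a job can run in parallel, the best
# possible speedup on n cores is 1 / ((1 - p) + p / n).

def amdahl_speedup(p: float, n: int) -> float:
    """Ideal speedup on n cores when fraction p of the work is parallel."""
    return 1.0 / ((1.0 - p) + p / n)

# Assumed example: 90% of the work parallelises perfectly.
for n in (1, 2, 4, 8):
    print(f"{n} cores: {amdahl_speedup(0.9, n):.2f}x")
```

Even with 90% of the work parallel, two cores give only a 1.82x speedup and eight cores just 4.71x, which is why doubling the core count never doubles the speed.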

Even worse, since the latest generation of CPUs actually run their cores slower when more of them are active at the same time (a side-effect of Intel’s “Turbo Boost” feature), adding more cores can actually make some operations take longer.

Nevertheless, the state-of-the-art CPUs to use with 3DM Analyst Mine Mapping Suite are Intel’s current generation of Core i7 CPUs, and the higher the clockspeed and the greater the number of cores, the better.

To give some concrete numbers, consider the following movie from our YouTube channel, showing how to process a couple of images from scratch to generate a DTM:

That video was recorded on a 3 GHz quad-core Intel Core 2 Extreme QX9650 CPU, the flagship of Intel’s previous CPU generation. As you can see, it managed to generate 5,870 points/second. The flagship of the current generation was the 3.33 GHz Intel Core i7 975 (there is now a six-core version at the same clockspeed called the “980X”), and it uses Intel’s “Turbo Boost” technology, which raises the clockspeed when fewer cores are active, so let’s see how fast it is using different numbers of threads:

CPU                  Threads   Points/second   Relative Performance
Intel Core i7 975       4          10,156              186%
Intel Core i7 975       2           7,979              146%
Intel Core i7 975       1           5,462              100%

The first thing to note is that this CPU is about 73% faster than the previous generation when both are using four threads. Part of that is the higher clockspeed, but that only accounts for about 15% of the improvement. A much bigger contributor is the improved memory architecture in the current generation.

As mentioned before, however, doubling the number of threads gives much less than double the performance. In fact, even four threads are less than twice as fast as a single thread, partly because the clockspeed drops from 3.6 GHz in single-thread mode down to 3.33 GHz with four threads active, but mostly because of Amdahl’s Law.
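Out of interest, we can turn those numbers around and ask what serial fraction they imply. This is a back-of-the-envelope calculation using the figures in the table above; the simple clock correction and the Amdahl model are both simplifying assumptions:

```python
# Estimate the parallel fraction implied by the i7 975 figures above,
# correcting for the Turbo Boost clock difference between one active
# core (3.6 GHz) and four active cores (3.33 GHz).

one_thread = 5_462     # points/second, 1 thread (from the table above)
four_threads = 10_156  # points/second, 4 threads (from the table above)

# Speedup the 4-thread run would show if both ran at the same clockspeed.
clock_adjusted = (four_threads / one_thread) * (3.6 / 3.33)

# Invert Amdahl's Law for n = 4:  S = 1 / ((1 - p) + p/4)
#   =>  p = (1 - 1/S) / (1 - 1/4)
p = (1.0 - 1.0 / clock_adjusted) / (1.0 - 1.0 / 4.0)

print(f"clock-adjusted 4-thread speedup: {clock_adjusted:.2f}x")
print(f"implied parallel fraction: {p:.0%}")
```

On these assumptions roughly a third of the work is effectively serial, which is why throwing still more cores at a single project would not help much.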

We can actually side-step Amdahl’s Law, however, if we use DTM Generator. The problem with processing a single project is that some portions of the process are inherently serial: they cannot easily be split between independent threads of execution, so the additional cores sit idle while a single core works through the serial portion. If we instead keep every core busy by processing multiple projects at once, we can achieve much higher rates of throughput, and this is exactly what DTM Generator does:

CPU                  Threads   Points/second   Relative Performance
Intel Core i7 975       8          26,602              463%
Intel Core i7 975       4          18,298              319%
Intel Core i7 975       2          10,924              190%
Intel Core i7 975       1           5,741              100%

(DTM Generator supports up to eight threads. The i7 975 is a quad-core CPU, but it has hyperthreading, which means each physical core executes two threads simultaneously, making it appear as an eight-core CPU; in this case that gives about a 45% speedup over four threads.)

This time the speed scales much more linearly with the number of cores, once the reduction in clockspeed as more cores come online is taken into account. Processing just two projects in parallel, using one core for each, already gives higher throughput (10,924 points/second) than using all four cores to process just one project (10,156 points/second).

The trade-off, of course, is the time it takes to complete any single project. If you’re sitting waiting for one project to finish, it’s better to throw all your computing resources at it, even if that is less efficient; if you’re batch-processing many projects, however, maximising throughput is the better option.
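The trade-off can be made concrete with the measured rates from the two tables above. The 100,000 points per project is an assumed round number purely for illustration:

```python
# Latency vs throughput for a batch of 8 projects, using the measured
# points/second from the tables above. Project size is an assumption.

points_per_project = 100_000  # assumed size of each project
projects = 8

single_rate = 10_156  # four threads working on one project at a time
batch_rate = 26_602   # eight projects processed in parallel

# One project at a time: the first result arrives quickly...
first_result = points_per_project / single_rate
# ...but the whole batch takes much longer.
sequential_total = projects * points_per_project / single_rate

# All eight in parallel: every result arrives at the end, but far sooner.
parallel_total = projects * points_per_project / batch_rate

print(f"one at a time: first result in {first_result:.0f} s, "
      f"batch done in {sequential_total:.0f} s")
print(f"in parallel:   batch done in {parallel_total:.0f} s")
```

Under these assumptions the whole batch finishes in about 30 seconds instead of nearly 80, but the first individual result takes three times longer to appear.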

Relative Cost

Increasing the clockspeed improves the performance. Increasing the number of cores improves the performance. Using a newer version of the CPU (Intel i7 vs Intel Core 2) improves the performance. But all of these improvements also increase the cost.

Right now, from our local PC supplier, the current flagship Intel i7 980X CPU (six cores, 3.33 GHz) costs A$1,397 including GST. The next model down, the i7 960 (four cores, 3.20 GHz), costs just A$682, and the i7 930 (four cores, 2.8 GHz) only A$339.

The 980X is at most 56% faster than the 960 (50% more cores and 4% higher clockspeed: 1.5 × 1.04 ≈ 1.56), but it costs more than double. Is it worth it?

It turns out that this is not actually a simple question to answer. Firstly, it’s misleading to look at the ratio of the CPU prices in isolation: the CPU affects the performance of the whole computer, so a 56% faster CPU translates into (at best) a 56% faster computer. If we assume that the rest of the computer costs, say, $2,000, then the price ratio becomes $3,397/$2,682 = 127%. 56% faster for 27% more cost looks like a good deal.

But we can go further than that: if the computer is faster, then the user is more productive. Adding their salary into the equation reduces the apparent cost of the CPU even more.

However, there are caveats. Firstly, the CPU is at most 56% faster, as noted above. In practice the gain will be far less because, as we saw, performance does not scale linearly with the number of cores, and 50 of those 56 percentage points come from the more expensive CPU having six cores rather than four.

Not only that, but for much of the time the user is sitting at the computer, the CPU is idle. If you go back to the YouTube video again, you’ll see that the entire clip is nearly 1.5 minutes long, but the CPU was only working for a third of that time; the rest of the time the computer was waiting for the user. So if we doubled the computer’s performance, we would only reduce the time taken from 1.5 minutes to 1.25 minutes, a speedup of just 20%. Furthermore, once you have the DTM, chances are you’re going to start actually mapping it, and that can easily take tens of minutes.
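That waiting time is really just Amdahl’s Law again, with the user as the serial component. Using the figures from the video example:

```python
# Overall speedup when the CPU is only busy part of the time.
# Figures from the video example above: a 1.5-minute clip with the
# CPU actually working for a third of it.

total = 1.5     # minutes for the whole task
cpu_busy = 0.5  # minutes the CPU was actually working

def overall_speedup(cpu_speedup: float) -> float:
    """Whole-task speedup when only the CPU-bound portion gets faster."""
    new_total = (total - cpu_busy) + cpu_busy / cpu_speedup
    return total / new_total

print(f"2x CPU -> {overall_speedup(2.0):.0%} overall")        # 20% faster
print(f"infinitely fast CPU -> {overall_speedup(1e9):.0%}")   # 50% faster, at best
```

Even an infinitely fast CPU would only make this particular task 50% faster, because the user’s own time dominates.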

So the actual increase in the user’s overall productivity is going to be far less than the difference in CPU performance implies. The calculation that really makes sense is the staff member’s overall productivity increase relative to the overall increase in cost (salary + computer cost + marginal cost of the faster CPU).
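As a sketch of that calculation, assume some placeholder figures; the salary and the three-year lifespan are assumptions for illustration only, while the prices are the ones quoted above:

```python
# Marginal cost of the faster CPU as a fraction of the total cost of
# employing the staff member. Salary and PC lifespan are assumptions.

salary_per_year = 80_000   # assumed annual staff cost (AUD)
lifespan_years = 3         # assumed PC replacement cycle
base_system = 2_000        # rest of the computer (figure used above)
cpu_960, cpu_980x = 682, 1_397  # prices quoted above (AUD, inc. GST)

total_cheap = salary_per_year * lifespan_years + base_system + cpu_960
total_fast = salary_per_year * lifespan_years + base_system + cpu_980x

extra = (total_fast - total_cheap) / total_cheap
print(f"marginal cost of the 980X: {extra:.2%} of the three-year total")
```

On these assumptions the faster CPU adds about 0.3% to the total three-year cost, so it only needs to make the user about 0.3% more productive to pay for itself.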

Taking all of that into account, we tend to purchase higher-end CPUs to maximise staff productivity. Our current PCs all use the Intel i7 975, which, at the time we purchased them, cost the same as the 980X does now, and the $650 difference in cost (excluding GST) was easily justified.

Desktop or Workstation?

CPU vendors typically split their CPU lines into “consumer” and “workstation” varieties. Intel call their consumer CPUs “Core i7” and their workstation CPUs “Xeon”, and the current generation of both use the “Nehalem” architecture. The “UP Server” (“Uniprocessor”, i.e. single-CPU) Xeons are identical to the Core i7 versions with the same clockspeed: a Xeon W3580, for example, is actually the same CPU as the Core i7 Extreme 975, and Intel even sell them for the same price.

The “DP Server” (“Dual Processor”) versions have additional interfaces to allow more than one CPU to be used in the same computer (2 x QPI rather than the 1 x QPI of the Core i7 and UP Server CPUs), and Intel charge more for them even though, by themselves, they perform no differently from either the UP Xeons or the Core i7 CPUs.

Since even one of these CPUs has four cores and eight virtual cores — more than enough for our software — there’s really no point building a workstation capable of holding more than one CPU, and therefore paying the premium for the DP Server Xeon chips is a waste. If the system you are buying uses a UP Xeon (check the model under “Processor No.” in Intel’s price list), make sure you’re not paying a premium for the “Xeon” label over the equivalent Core i7: Intel don’t charge extra for UP Xeons, so you shouldn’t be paying extra either.

Conclusion

Faster CPUs and more cores certainly make the software faster. How much more productive they make the user, however, depends on what percentage of their time they spend waiting for the computer. If they are fairly intensive computer users, the marginal cost of a high-end CPU is relatively insignificant in the overall scheme of things: it only needs to increase their productivity by a tiny amount to be worthwhile.

Currently, the fastest CPU on the market for our software is the Intel i7 980X. Using a dual-processor (or even quad-processor) Xeon workstation instead would be much harder to justify because of the increased cost and the diminished returns.