Multi-core scaling of the Revit Database

by Guy Robinson 1. February 2010 13:40

On a fairly regular basis I’m asked questions along these lines:

  • “We’ve been asking for multi-threaded Revit for years. How hard can it be? <Insert usual ADSK expletive>…”(1)
  • “Why doesn’t Revit utilise all my PC’s cores like <insert application>…”
  • “Revit isn’t a real database, and it’s slow according to our DB expert…(2)”
  • “If ArchiCAD can use multiple cores then there should be no reason Revit can’t do it.(3)”

These questions, and Intel’s press release on their 48-core processor, have prompted this post, an attempt to explain why I believe making Revit performance scale with multiple cores is a difficult software engineering project.(4)

Do I think we’ll get a measure of scaling eventually? Yes. But it will probably happen progressively and with caution.

What is a database?

I’ll leave it to Wikipedia to describe in detail. I’m going to deliberately try to explain the problem without using computer science. However ;-), to set the scene…

Simplistically, a database comprises 2 major areas of functionality.

Transactional Storage. I don’t intend to discuss the ACID properties of the distributed Revit database or other aspects of database storage.

Querying. This is the aspect of the Revit database that affects the user experience the most when it comes to performance. An application uses querying both to read from a database and to write to or update it.

It is the querying of the Revit database I’m going to discuss in this post.

Setting the scene

All the features Revit users take for granted, such as:

  • A window moving with its host wall when the wall is moved.
  • A wall locked to a grid moving with the location change of the grid.
  • A view title on a sheet updating when the view is renamed.

They are all based on a single principle: a defined relationship. Take, for example, a window hosted by a wall. This can be represented as follows:

RevitRel

Keep this in mind as I build up a picture of the problem Autodesk have with querying the Revit database with a multi-core architecture.
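
As a rough illustration of a defined relationship, here is a minimal Python sketch of a host element that carries its dependents with it when it moves. The `Element` class and its `host` and `move` methods are hypothetical names invented for this example; they are not the Revit API and say nothing about how Revit stores relationships internally.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical model element: a position plus the elements it hosts."""
    name: str
    x: float = 0.0
    hosted: list[Element] = field(default_factory=list)

    def host(self, child: Element) -> None:
        # Record the relationship: `child` now depends on this element.
        self.hosted.append(child)

    def move(self, dx: float) -> None:
        # Moving the host applies the same offset to every hosted element.
        self.x += dx
        for child in self.hosted:
            child.move(dx)


wall = Element("wall", x=0.0)
window = Element("window", x=2.5)
wall.host(window)

wall.move(1.0)
print(wall.x, window.x)  # 1.0 3.5
```

The point of the sketch is only that a change to one element cannot be finished without also touching the elements related to it.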

Doing some work

Say you have some work to do. There are 5 discrete tasks that can be represented as follows:

 FiveTasks

At the moment you are on your own, so you start with one and after finishing that you move on to the next one. But it’s too slow. The boss wants to speed things up, so he brings in more people to each do a discrete task (5):

 FiveTasksFiveWorker

The boss is happy, but he notes four of the workers are sitting there twiddling their thumbs, waiting for the slowest task to finish so they can go home. So he splits that slow task up into discrete subtasks and gives the other workers one subtask each to do:

 FiveTasksSubFiveWorker

There’s some extra work required to combine the discrete subtasks back together in a coordinated way, which costs some time, but overall the tasks are all completed in considerably less time.(6) Everyone goes home happy…
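
The three scenarios above can be put into rough numbers with a small Python sketch. The task durations and the merge cost are made-up values chosen for illustration, and the function names are hypothetical; the model assumes the slow task divides evenly into one subtask per worker.

```python
def serial_time(durations):
    """One worker does every task, one after another."""
    return sum(durations)


def one_worker_per_task_time(durations):
    """Each task has its own worker; everyone waits for the slowest task."""
    return max(durations)


def split_slowest_time(durations, merge_cost):
    """Split the slowest task into one equal subtask per worker.

    Every other worker does their own task plus one subtask, and
    `merge_cost` is the extra time spent combining the subtasks.
    """
    workers = len(durations)
    slowest = max(durations)
    piece = slowest / workers
    others = list(durations)
    others.remove(slowest)  # drop the first occurrence of the slow task
    finish_times = [piece] + [d + piece for d in others]
    return max(finish_times) + merge_cost


durations = [2, 3, 2, 10, 3]  # assumed durations, in arbitrary time units

print(serial_time(durations))                       # 20
print(one_worker_per_task_time(durations))          # 10
print(split_slowest_time(durations, merge_cost=1))  # 6.0
```

With these assumed numbers the work drops from 20 units to 10 and then to 6, and the merge cost is the price paid for the last step.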

The Revit taskforce

The boss gives the workers some new tasks. However, these tasks are no longer discrete: each task depends on some aspect of another task:

 FievRelTasks

In other words each task has a relationship to another task.

So no task can be completed independently by a worker without coordinating and communicating with another worker. This coordination takes planning and time, and a task may not be completed if the task it depends on fails. More critically, throwing more workers at the tasks may not have the desired effect because they’ll spend more time communicating than doing any work on the task.

This in a nutshell is the problem for Autodesk with Revit. Except it looks more like this:

 RevitTF

Each task has multiple relationships to other tasks. So where do Autodesk start when throwing more workers at the tasks? Critically, you don’t want to be doing a task more than once. And some tasks will be a priority in some contexts and not others. Revit has to manage this and work out the best route to finishing updating a project as quickly as possible. Put simply, this is a massive mathematical and computer science project on many levels.
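
To make the effect of relationships on parallel work concrete, here is a minimal Python sketch that groups interdependent tasks into "waves" that could run at the same time. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The element names and their dependencies are invented for illustration, and this is not a description of how Revit actually schedules its updates.

```python
from graphlib import TopologicalSorter


def parallel_waves(depends_on):
    """Group tasks into waves; tasks within a wave do not depend on each other.

    `depends_on` maps each task to the set of tasks it must wait for.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task that can start now
        waves.append(ready)
        sorter.done(*ready)                 # finishing them unlocks the next wave
    return waves


# Hypothetical update graph: each key waits for the tasks in its set.
depends_on = {
    "wall": {"grid"},
    "window": {"wall"},
    "room": {"wall"},
    "tag": {"window"},
    "schedule": {"window", "room"},
}

for wave in parallel_waves(depends_on):
    print(wave)
```

In this made-up graph six tasks collapse into four waves, and no wave holds more than two tasks, so a third worker would have nothing to do however many cores are available.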

Do Autodesk have the resources to solve this problem? Yes.

Will they succeed? Sure.

Will it happen overnight? I doubt it.

Will they achieve a linear scaling of performance for n-cores? No, for reasons that I described above.
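
One common way to put a number on this limit is Amdahl's law, which the post does not name but which describes the same effect: if only a fraction of the work can be spread across cores, the remainder caps the overall speedup. The sketch below is a plain Python version of that formula; the 50% parallel fraction is an assumption picked for illustration, not a measurement of Revit.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed: half of the work can run in parallel, half cannot.
for cores in (1, 4, 48):
    print(cores, round(amdahl_speedup(0.5, cores), 2))
```

Under that assumption, going from 4 to 48 cores barely helps, because the speedup can never exceed 2 while half of the work stays serial.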

So the next time someone asks, I hope this post goes some way to helping you explain why it’s so difficult for Autodesk to scale Revit for multi-core CPU architecture.

Notes:

(1) Revit is multi-threaded, but it doesn’t scale linearly with multiple cores across all areas of functionality. Rendering and some other functionality will use multiple cores but, with the exception of rendering, the improvement will be barely noticeable for now.

(2) Revit is not built on a relational database object model. It is more like an object database model, with some similarities to source code systems like CVS for replicating changes.

(3) The details would take another post, but I don’t consider ArchiCAD a true BIM application. It took Revit for people to realise buildings were designed by multi-disciplined teams and therefore a coordinated building model for ALL disciplines would be the game changer that we now think of as BIM. One of the core differences between Revit and ArchiCAD can be explained by the principles discussed in this post.

(4) I have no access to the source code (obviously), and I have had no discussions with the factory about this post. But I think I have an understanding of the issues they’re facing.

If you’re part of the DB team in the factory, please don’t laugh too loudly ;-)

(5) This is equivalent to raytrace rendering, video encoding, etc., where each thread can handle a discrete task and therefore you get close to n-core scaling.

(6) Simplistically this is what we get now with some aspects of Revit functionality.

Comments (7) -

N. Gordon United States
2/4/2010 12:13:30 AM #

Here's the thing. Everyone *KNOWS* multi-threaded programming is hard. I'm a programmer, and I know it. You can ask some of the top programmers in the world what they consider the most difficult programming concept or skill that any person could possibly even HOPE to grasp, and they will tell you it is threading.

What your post mainly refers to is the concept of a race-condition (Wikipedia has a great overview of the concept here: http://en.wikipedia.org/wiki/Race_condition). Two threads cannot be attempting to modify the same data simultaneously, or you have problems that are incredibly difficult to reproduce or even discover. This causes a breakdown in the traditional debugging cycle for most people, where you write code, test code, and call it good. With a multi-threaded environment, you have to be *incredibly* confident that you have considered every possible condition that could arise before you write a single piece of code!

So what's the problem here? ADesk has lots and lots of qualified programmers, but to get this to work (it has very little to do with the database structure) LOTS and LOTS and LOTS of code has to be completely scrapped, re-designed, re-structured, heavily tested, and that takes Money (with a capital M) and time.

Then... as you have already addressed, performance will not scale linearly with the number of cores for an application like Revit in normal use cases. For things like rendering it will, absolutely. In fact, for most average uses, performance will be affected minimally, and the increase will reach a plateau very quickly. At a certain point, the payoff from engaging more cores to complete tasks approaches zero. The rate at which it approaches zero is proportional to the relative similarity and independence of each task split between cores. As you said above, Revit has to perform a lot of processing based on relationships and constraints between objects, which equals very little independence. Rendering... easy! You're doing the same thing to every pixel on the screen.

In summation, there's not going to be a HUGE performance boost from spending time on incorporating multi-threaded code into Revit. I personally think the time would be better spent optimizing the code that already exists. With multi-threaded code being included in Revit, the thing that is more likely to happen is more crashes, more bugs, and more instability. That is not a jab at the wonderful programmers that work at ADesk, it's a fact. Any programmer that pretends to know 100% what they're doing when it comes to threaded applications is either too naive to understand the problems or a complete liar. Or an alien.

Guy Robinson
2/4/2010 1:59:03 AM #

@ N.Gordon,

Thanks for the comment. What I've described is not just about a race-condition. That's just one aspect of threaded programming. This post is all about describing the domain model used by Revit.

You're right, threaded programming is very difficult, but it's not impossible... As I said Revit is already multi-threaded.

The CPU manufacturers have made it clear multi-core is the way of the future. If Revit is ever to scale in the future they need to figure out how to do it. I'm confident the existing code is already fairly optimised, Revit's been around a while...

DaveP United States
2/6/2010 2:21:25 AM #

One other explanation I’ve heard uses an analogy of a math problem.
If you are trying to get the answer to 100 addition problems, a class full of 5th graders is going to be faster than one person with a Math PhD.
However, if you are trying to answer a Differential Equation, the 5th graders aren’t going to help much.
Revit is more like the Differential Equation.

Walt Sorensen United States
2/6/2010 7:23:56 AM #

Since the major lag point in Revit revolves around database queries that all need to happen at the same time for many objects, we can safely say that 8 threads on an i7 will help but will not be game changing.

What we need is a plugin that will push the database querying in Revit to a workstation graphics processor with CUDA or Stream technology. I can see a big potential performance increase using a desktop supercomputer with 4 Nvidia Tesla cards or similar FireStream cards. Check out what a Tesla card can do: www.nvidia.com/.../tesla_computing_solutions.html

specifically in the data mining of databases area and rendering.....www.nvidia.com/.../...ning_analytics_database.html

(I would put something about ATI's firestream but ATI doesn't publish the data like Nvidia does)

Guy Robinson
2/6/2010 10:56:51 AM #

@Walt
Walt, it's not about database queries all needing to happen at the same time, it's that objects have (many) relationships to other objects, so discrete querying doesn't work. You can't think like a relational database. So i7s are not a game changer.

I doubt nVidia tesla cards would help at all. It's an algorithm problem rather than specifically a hardware problem.

D. Raynor United States
2/7/2010 12:19:49 AM #

When people are waiting for Revit to move an object and task manager is telling them they are using less than 10% of the processor power they get frustrated. If Revit could make the task manager show 100% while they are waiting, they would be happy.

Walt Sorensen United States
2/7/2010 1:47:00 AM #

@Guy, I'm not thinking about it as a relational database or linear database, I'm thinking about it like an object-relationship database. Every time an object is viewed, moved, or modified in Revit, all the objects that are interconnected or have a relationship to the original object are also accessed at a minimum, along with all the objects that are interconnected or have a relationship to those objects, and so forth (as per your explanation and graphics). That is a lot of data mining and querying of the Revit database to pull up the specifications for all those objects and make changes as needed. Anything that would speed up that repeated database access would help.

I do agree that part of the problem with Revit is the algorithms, but some hardware like the nVidia Tesla (with a custom plugin) would process the algorithms faster. Matlab is an algorithm-intensive program and has a plugin that accelerates algorithm processing with nVidia Tesla cards. I will agree that currently nVidia Tesla cards will not help, as there is no plugin for Revit. I will also mention that Tesla cards and FireStream cards are designed for engineering, but their use is not being demanded yet for products like Revit.

About the Author

A .NET software Developer providing custom applications and commands for architecture firms exclusively working with Autodesk Revit and integration with any associated applications. All from a little place north of Whitianga, New Zealand.

Disclaimer

I'm self-employed, so the opinions expressed herein are my own personal opinions and do not represent my employer's view in any way ☺

© Copyright 2008

Creative Commons License
Blog content is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

With the following exception: all code snippets, applications and libraries are licensed under the Apache License, Version 2.0.

Autodesk Revit®

Autodesk: Revit is a product that is wholly owned by Autodesk. Any reference to Revit, Revit API, Revit Architecture, Revit MEP or Revit Structure on this site is made acknowledging this ownership. Refer to Autodesk's own web site and product pages for specific trademark and copyright information. Autodesk represents a great many products and every attempt will be made to respect their ownership whenever one of these other products is mentioned on this site.