Bugzilla – Bug 224
[asmparser] Really slow parsing of types with complex upreferences
Last modified: 2004-02-09 15:15:15
You need to log in before you can comment on or make changes to this bug.
Parsing the make_dparser.linked.rll file takes a _long_ time. Though the file is 10M of .ll file, it takes almost a minute to parse, spending almost all of its time in the asmparser's HandleUpRefs function. This is bad. -Chris
These two patches speed up llvm-as on the testcase from 61 -> 22s. http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040202/011358.html http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040202/011359.html -Chris
FWIW, the rest of the slowness in the bug is due to this loop in Type.cpp: // Now we check to see if there is an existing entry in the table which is // structurally identical to the newly refined type. If so, this type // gets refined to the pre-existing type. // for (iterator I = Map.begin(), E = Map.end(); I != E; ++I) if (TypesEqual(Ty, I->second)) { assert(Ty->isAbstract() && "Replacing a non-abstract type?"); TypeClass *NewTy = cast<TypeClass>((Type*)I->second.get()); // Refined to a different type altogether? Ty->refineAbstractTypeTo(NewTy); return; } In the common case, this loop must call TypesEqual (a _very_ expensive function) on EVERY SINGLE type of a particular flavor (e.g. StructType) created in the program. I've long known of this horrible code, but haven't had a wonderful idea of how to fix it. Perhaps sleeping on it will help. :) -Chris
I don't know *anything* about the problem you're facing, but just staring at the code I do have a suggestion. Since you're comparing the type for equivalency against I->second, I assume the key value of the map isn't related to the type. Perhaps you need a new map, one that is keyed by "type equivalency" or some close approximation that is cheap to compute. If you had a map or hashmap that was keyed by this "type equivalency" your search would at worst be narrowed down to just a few items and at best would be a simple binary search or tree walk to the type of interest. Here's one approach. For each type encountered, compute a hash value that factors various aspects of the type into the hash number: basic type of type, number of elements, types of elements, etc. If you have bytecode for the type, just add up the bytecodes. Place each type into a hash map using this hash value. This would at least cut the search space down by orders of magnitude and reduce the number of comparisons you have to do. Another approach, a little harder, is to produce a perfect ordering function for types. That is, all equivalent types produce the same ordering value and no non-equivalent types produce the same ordering value. Computing that might be as expensive as TypesEqual, but you would only have to compute it once per type and then use it as the key for a map. Again, even if you couldn't make it a perfect ordering function, you'd at least reduce the search space quickly (by computing just the one ordering function call on Ty and then using lowerbound and upperbound on the map). My $0.02 worth.
Sounds good, that's exactly what I'll do! :)
Ok, so this patch implements the "hashing scheme" for types in an attempt to reduce the number of TypeEquals calls that have to be made. This patch does indeed reduce the number of calls, from this: 740332 bytecodewriter - Number of bytecode bytes written 61946 type - num types actually equal 8643646 type - num typeequals calls 104010 type - num slow types To this: 740332 bytecodewriter - Number of bytecode bytes written 61946 type - num types actually equal 6328911 type - num typeequals calls 104010 type - num slow types But this really doesn't speed up llvm-as _at all_. The problem is that, due to the nature of the problem, the hashing scheme cannot look at anymore more than the local type itself. For structure types, our main problem, this means that we can only look at the number of elements in the structure for our hash. :( I have an idea that might help this problem from another angle, by reducing the number of types being resolved in the first place... hopefully that will help. -Chris
Err, right. This patch = http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040209/011392.html
Ok, this patch: http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040209/011396.html Helped a lot with the count, but not with time. Before: 740332 bytecodewriter - Number of bytecode bytes written 61946 type - num types actually equal 6328911 type - num typeequals calls 104010 type - num slow types 25.410u 0.180s 0:25.65 99.7% 0+0k 0+0io 453pf+0w Now: 740332 bytecodewriter - Number of bytecode bytes written 15790 type - num types actually equal 324889 type - num typeequals calls 55319 type - num slow types 21.720u 0.350s 0:22.22 99.3% 0+0k 0+0io 453pf+0w That reduced the number of TypeEquals calls by 1/2, but didn't help that much. Disappointing. -Chris
Ok, it turns out that one big problem was the call to TypeHasCycleThroughItself, which is required to decide whether to use the "fast path" or the "slow path". Because it needs to be called on every type, its speed is kinda important. By using the observation that only abstract types are interesting, we can speed the process up a lot. This patch speeds the testcase up to 11.21s: http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040209/011397.html It also slightly reduces the number of TypeEquals calls, though not significantly: 740332 bytecodewriter - Number of bytecode bytes written 15790 type - num types actually equal 319413 type - num typeequals calls 55135 type - num slow types 11.210u 0.190s 0:11.55 98.7% 0+0k 0+0io 451pf+0w -Chris
This patch: http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040209/011399.html Speeds up the parser. It takes llvm-as from 11.2s -> 5.13s. -Chris
At this point, I'm going to declare this bug fixed. The testcase now takes ~5s to parse instead of 60s, and most of that time is spend in the lexer and the bison machinery (it's a 10M .ll file after all). The profile is basically flat now, with no big hotspot. Fixing this bug required improvments in the upreference handling code in the asmparser, the type resolution code in the VMCore library, and value resolution code in the asmparser. These should all be useful to LLVM "in general", as opposed to this one special case. -Chris