Bugzilla – Bug 107
AsmParser Misses Symbol Redefinition Error
Last modified: 2003-11-11 22:44:25
You need to log in before you can comment on or make changes to this bug.
Test Case: void %test() { %X = add int 0, 1 %X = add int 1, 2 ret void } This assembles fine (no errors/warnings) because the second add gets renamed. However, it should be a symbol redefinition error.
Analysis: I tracked down the setValueName function in llvmAsmParser.y which is called by the Inst production to name the value (instruction). setValueName calls Instruction::setName if there is no existing name. This function requires that the Instruction have a Function grandparent so it can insert the name in the Function's symbol table. However, at this stage of its incarnation, the Instruction object never has a parent or a grandparent. The instruction's getParent() always returns null. So, no named instructions are ever getting into the symbol table from the Inst production. That being the case, I'm not sure how llvm-as ever worked! I believe the problem is one of timing. The setValueName function is being called in the Inst production before the instruction is assigned its parent. The call to InsertValue that immediately follows setValueName requires the value (instruction) to be named by calling Value->hasName(). If it doesn't have a name then InsertValue returns immediately with -1 and it never gets inserted. Again, how could llvm-as ever work if this was the case? This two findings seem a little odd to me so I'm probably missing something. Can someone more knowledgeable comment on my analysis?
Oh, I see what's going on. You've got it exactly right, you've just missed one step. In LLVM, when instructions are added to Basic blocks (or bb's to functions, or fn's to modules...), the symbol table for the function is automatically updated to contain the name for the instruction if the bb is in a function. If not, when the BB gets added to a function, the BB itself and all its instructions get added to the symbol table. Because we don't want all of the LLVM transformations to have to worry about colliding variable names, the symbol table class autorenames stuff for its clients. What's happening here is that the instruction gets created with the requested name, but is never inserted into the symbol table until this production runs: BasicBlockList : BasicBlockList BasicBlock { ($$ = $1)->getBasicBlockList().push_back($2); } | FunctionHeader BasicBlock { // Do not allow functions with 0 basic blocks ($$ = $1)->getBasicBlockList().push_back($2); }; This is when a newly formed basic block gets added to a function. At this point, we should be checking for redefinitions, and while we're at it, should not allow basic blocks with colliding names either. This probably needs to be extended to support detection of collisions of all of the major LLVM objects: Instruction, BasicBlock, Function, Global Variable Although I think that some of them are already handled (functions and gvars). -Chris
Okay, let me spin on this a while. I understand conceptually where the fix needs to go but need to track down the "right" place to put it in. I need to figure out how to cleanly separate collision detection from reporting the duplicate symbo name in AsmParser. Thanks for your help and for wisely assignign this problem to me. I'm learning tons about LLVM just from working this bug.
Kewl :)
Created an attachment (id=17) [details] A patch to llvmAsmParser.y to ensure no locally scoped symbol redefinitions. This patch adds a "local_symtab" member to the AsmParser's CurFun object (PerFunctionInfo). It is a SymbolTable to keep track of the local symbols in the function. Any redefinition of a symbol in a type plane added to the function will cause a redefinition error. FWIW: There might be other ways to solve this problem.
PLEASE NOTE: The patch must be added *AFTER* the patch for bug 109 (namespace) because it was derived from llvmAsmParser.y after the namespace change.
Looks good, but please follow the "golden rule" in the coding standards doc: http://llvm.cs.uiuc.edu/docs/CodingStandards.html#introduction Since this is a small modification to a preexisting file, please follow the style in use in the file. In particular, this means: * "local_symtab" should be "LocalSymtab" or something * the 'if ( inFunctionScope() ) ' should be 'if (inFunctionScope()) '. (likewise with existing) Otherwise, it looks good! -Chris
Created an attachment (id=18) [details] This patch follows the coding standards better.
Looks great. I'll apply it sometime tonight. Thanks! -Chris
Patch applied, and testcase added: http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20031110/009406.html http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20031110/009407.html http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20031110/009408.html http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20031110/009409.html Thanks! -Chris