First Last Prev Next    No search results available
Details
: AsmParser Misses Symbol Redefinition Error
Bug#: 107
: libraries
: LLVM assembly language parser
Status: RESOLVED
Resolution: FIXED
: All
: All
: 1.0
: P2
: minor
: 1.1

:
: accepts-invalid
:
:
  Show dependency tree - Show dependency graph
People
Reporter: Reid Spencer <rspencer@reidspencer.com>
Assigned To: Reid Spencer <rspencer@reidspencer.com>
:

Attachments
A patch to llvmAsmParser.y to ensure no locally scoped symbol redefinitions. (1.96 KB, patch)
2003-11-11 15:49, Reid Spencer
Details
This patch follows the coding standards better. (1.95 KB, patch)
2003-11-11 18:36, Reid Spencer
Details


Note

You need to log in before you can comment on or make changes to this bug.

Related actions


Description:   Opened: 2003-11-09 22:52
Test Case:

void %test() {
  %X = add int 0, 1
  %X = add int 1, 2
  ret void
}

This assembles fine (no errors/warnings) because the second add gets renamed.
However, it should be a symbol redefinition error.
------- Comment #1 From Reid Spencer 2003-11-09 23:49:38 -------
Analysis:

I tracked down the setValueName function in llvmAsmParser.y which is called by
the Inst production to name the value (instruction). setValueName calls
Instruction::setName if there is no existing name. This function requires that
the Instruction have a Function grandparent so it can insert the name in the
Function's symbol table. However, at this stage of its incarnation, the
Instruction object never has a parent or a grandparent. The instruction's
getParent() always returns null. So, no named instructions are ever getting into
the symbol table from the Inst production. That being the case, I'm not sure how
llvm-as ever worked!

I believe the problem is one of timing. The setValueName function is being
called in the Inst production before the instruction is assigned its parent. The
call to InsertValue that immediately follows setValueName requires the value
(instruction) to be named by calling Value->hasName(). If it doesn't have a name
then InsertValue returns immediately with -1 and it never gets inserted. Again,
how could llvm-as ever work if this was the case?

This two findings seem a little odd to me so I'm probably missing something.

Can someone more knowledgeable comment on my analysis?
------- Comment #2 From Chris Lattner 2003-11-09 23:57:17 -------
Oh, I see what's going on.  You've got it exactly right, you've just missed one
step.

In LLVM, when instructions are added to Basic blocks (or bb's to functions, or
fn's to modules...), the symbol table for the function is automatically updated
to contain the name for the instruction if the bb is in a function.  If not,
when the BB gets added to a function, the BB itself and all its instructions get
added to the symbol table.

Because we don't want all of the LLVM transformations to have to worry about
colliding variable names, the symbol table class autorenames stuff for its clients.

What's happening here is that the instruction gets created with the requested
name, but is never inserted into the symbol table until this production runs:

BasicBlockList : BasicBlockList BasicBlock {
    ($$ = $1)->getBasicBlockList().push_back($2);
  }
  | FunctionHeader BasicBlock { // Do not allow functions with 0 basic blocks   
    ($$ = $1)->getBasicBlockList().push_back($2);
  };

This is when a newly formed basic block gets added to a function.  At this
point, we should be checking for redefinitions, and while we're at it, should
not allow basic blocks with colliding names either.

This probably needs to be extended to support detection of collisions of all of
the major LLVM objects:
  Instruction, BasicBlock, Function, Global Variable

Although I think that some of them are already handled (functions and gvars).

-Chris
------- Comment #3 From Reid Spencer 2003-11-10 00:05:11 -------
Okay, let me spin on this a while. I understand conceptually where the fix needs
to go but need to track down the "right" place to put it in. I need to figure
out how to cleanly separate collision detection from reporting the duplicate
symbo name in AsmParser.

Thanks for your help and for wisely assignign this problem to me. I'm learning
tons about LLVM just from working this bug.
------- Comment #4 From Chris Lattner 2003-11-10 00:20:37 -------
Kewl  :)
------- Comment #5 From Reid Spencer 2003-11-11 15:49:02 -------
Created an attachment (id=17) [details]
A patch to llvmAsmParser.y to ensure no locally scoped symbol redefinitions.

This patch adds a "local_symtab" member to the AsmParser's CurFun object
(PerFunctionInfo). It is a SymbolTable to keep track of the local symbols in
the function. Any redefinition of a symbol in a type plane added to the
function will cause a redefinition error.

FWIW: There might be other ways to solve this problem.
------- Comment #6 From Reid Spencer 2003-11-11 15:50:09 -------
PLEASE NOTE:

The patch must be added *AFTER* the patch for bug 109 (namespace) because it was
derived from llvmAsmParser.y after the namespace change.
------- Comment #7 From Chris Lattner 2003-11-11 15:56:46 -------
Looks good, but please follow the "golden rule" in the coding standards doc:
http://llvm.cs.uiuc.edu/docs/CodingStandards.html#introduction

Since this is a small modification to a preexisting file, please follow the
style in use in the file.  In particular, this means:

* "local_symtab" should be "LocalSymtab" or something
* the 'if ( inFunctionScope() ) ' should be 'if (inFunctionScope()) '. (likewise  
  with existing)

Otherwise, it looks good!

-Chris
------- Comment #8 From Reid Spencer 2003-11-11 18:36:47 -------
Created an attachment (id=18) [details]
This patch follows the coding standards better.
------- Comment #9 From Chris Lattner 2003-11-11 18:38:14 -------
Looks great. I'll apply it sometime tonight.

Thanks!

-Chris

First Last Prev Next    No search results available