Bug #18695 2012-08-04 23:11
ohir
Parsing GDB output failed
Linux 2.6.32-41 i686 GNU/Linux
C::B version svn8150

The GDB parser goes astray for certain kinds of input (strings). The watch head contains a "Parsing GDB output failed for ..." error message, and the dereferenced struct view is usually incomplete.

Cause: unknown. It looks like bad (overeager) regexp code reaching into the content of the string.

How to replicate: watch the dereferenced 't' in the code below:
[code]
#include <malloc.h>

const char ok1[]="\"a\" . b , c";
const char ok2[]="\"a\" b c";
const char ok3[]="\"a\" b ; c";
const char ok4[]="\"a\" b , c";
const char bad[]="\"a\" b , c"; /* note: 11 spaces between " and b; 10 spaces are ok. */

struct test {
    const char* s;
};

int main(void) {
    struct test *t;
    t = calloc(1,sizeof(struct test));
    t->s = ok1;
    t->s = ok2;
    t->s = ok3;
    t->s = ok4;
    t->s = bad;
    t->s = ok4;
    free(t);
    return 0;
}
[/code]
- Category: Debugger
- Group: Platform:All
- Status: Open
- Close date:
- Assigned to: tpetrov
History
Because the submission cleaner ate the spaces within the code, below is
the block of test strings with spaces represented by the '-' character.
To replicate the test scenario, you need to
replace each '-' with a space.
const-char-ok1[]="\"a\"-.---------b-,-c";
const-char-ok2[]="\"a\"-----------b---c";
const-char-ok3[]="\"a\"-----------b-;-c";
const-char-ok4[]="\"a\"----------b-,-c";
const-char-bad[]="\"a\"-----------b-,-c";
In file src/plugins/debuggergdb/parsewatchvalue.cpp, in function GetNextToken (line 113):
It seems the tokenizer does not account for an escape char at the start of a token. It just skips the opening \ .
Here is the obvious patch:
--- parsewatchvalue.cpp.orig
+++ parsewatchvalue.cpp
@@ -122,6 +122,7 @@
     token.start = -1;
     bool in_quote = false;
+    bool escape_next = false;
     int open_braces = 0;
     struct BraceType { enum Enum { None, Angle, Square }; };
     BraceType::Enum brace_type = BraceType::None;
@@ -141,6 +142,11 @@
         token = Token(pos, pos + 1, Token::CloseBrace);
         return true;
+    case _T('\\'):
+        escape_next = true;
+        token.type = Token::String;
+        token.start = pos;
+        break;
     case _T('"'):
         in_quote = true;
         token.type = Token::String;
@@ -164,7 +170,6 @@
     }
     ++pos;
-    bool escape_next = false;
     while (pos < static_cast<int>(str.length()))
     {
         if (open_braces == 0)
=================
I have no time to set up an environment for compiling C::B myself, so this patch was NOT tested.
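To illustrate the escape handling discussed above, here is a minimal standalone sketch of scanning a quoted string while tracking an escape_next flag. It is not the Code::Blocks tokenizer: the function name FindQuotedEnd and its signature are made up for this example, and it uses std::string instead of wxString.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Code::Blocks: given the index of an
// opening double quote, return the index one past the matching closing
// quote, or std::string::npos if there is no opening quote at that index
// or the string is unterminated.
std::size_t FindQuotedEnd(const std::string& text, std::size_t open)
{
    if (open >= text.size() || text[open] != '"')
        return std::string::npos;

    bool escape_next = false;
    for (std::size_t pos = open + 1; pos < text.size(); ++pos)
    {
        if (escape_next)
            escape_next = false;   // this char is escaped, take it literally
        else if (text[pos] == '\\')
            escape_next = true;    // the next char cannot close the string
        else if (text[pos] == '"')
            return pos + 1;        // an unescaped quote closes the string
    }
    return std::string::npos;
}
```

With the GDB output quoted later in this report, calling it at index 0 should stop right after the first segment "\"a\"" and not at one of the escaped quotes inside it.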
Hi, I can confirm the bug under Windows + gcc + gdb.
But I see that gdb does not always print the whole string buffer in one piece; sometimes it prints something like:
[debug]{s = 0x40b0d3 "\"a\"", ' ' <repeats 11 times>, "b , c"}>>>>>>cb_gdb:
Then the gdb parser always stops after the ">", I mean the "b , c" part is not shown in the watches. This is another kind of bug, right?
>>>>>> is the prompt as set by C::B. I do not know if this is another kind of bug, but I can tell that for parsing the cited output the tokenizer has another bug: it tries to check for repeated chars while in the "in_quote" state. That is wrong for the output:
"\"a\"", ' ' <repeats 11 times>, "b , c"
because the phrase <repeats 11 times> is NOT within the if(in_quote) scope. \"a\" is in_quote, then the quote closes. "b , c" is in quote again.
184: if (in_quote)
     {
         if (!escape_next)
         {
             int newPos = DetectRepeatingSymbols(str, pos);
The tokenizer code itself is convoluted enough to have more quirks like that. I can't tell for the rest. Someone needs to test it, um, step by step in a debugger with enough sample outputs to see where and how assumptions and presumptions bent the code into failing.
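As a rough sketch of the point above, the following standalone code treats the repeat phrase as a segment of its own, outside any quoted segment. It decodes only the output form cited in this thread (quoted segments and 'c' <repeats N times> groups separated by commas). DecodeGdbString and its helpers are hypothetical names, not Code::Blocks or GDB functions, and the simplifications are assumptions: a backslash just makes the next char literal, and other GDB forms such as a trailing "..." are rejected.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

namespace
{
void SkipSpaces(const std::string& text, std::size_t& pos)
{
    while (pos < text.size() && text[pos] == ' ')
        ++pos;
}

// Reads one char at pos; a leading backslash makes the next char literal.
bool ReadChar(const std::string& text, std::size_t& pos, char& ch)
{
    if (pos < text.size() && text[pos] == '\\')
        ++pos;
    if (pos >= text.size())
        return false;
    ch = text[pos++];
    return true;
}
}

// Hypothetical decoder: returns false on any input it does not understand.
bool DecodeGdbString(const std::string& text, std::string& out)
{
    const std::string prefix = "<repeats ";
    const std::string suffix = " times>";
    out.clear();
    std::size_t pos = 0;
    while (true)
    {
        SkipSpaces(text, pos);
        if (pos >= text.size())
            return false;                       // a segment was expected
        if (text[pos] == '"')                   // quoted segment
        {
            ++pos;
            bool closed = false;
            while (pos < text.size() && !closed)
            {
                char ch;
                if (text[pos] == '"') { ++pos; closed = true; }
                else if (ReadChar(text, pos, ch)) out += ch;
                else return false;
            }
            if (!closed)
                return false;
        }
        else if (text[pos] == '\'')             // char literal, outside quotes
        {
            ++pos;
            char ch;
            if (!ReadChar(text, pos, ch) || pos >= text.size() || text[pos] != '\'')
                return false;
            ++pos;
            SkipSpaces(text, pos);
            std::size_t count = 1;
            if (text.compare(pos, prefix.size(), prefix) == 0)
            {
                pos += prefix.size();
                if (pos >= text.size() || !std::isdigit(static_cast<unsigned char>(text[pos])))
                    return false;
                count = 0;
                while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos])))
                    count = count * 10 + static_cast<std::size_t>(text[pos++] - '0');
                if (text.compare(pos, suffix.size(), suffix) != 0)
                    return false;
                pos += suffix.size();
            }
            out.append(count, ch);
        }
        else
            return false;

        SkipSpaces(text, pos);
        if (pos == text.size())
            return true;                        // consumed the whole value
        if (text[pos] != ',')
            return false;
        ++pos;                                  // skip the segment separator
    }
}
```

Fed the value cited above, this should rebuild the original 'bad' string: "a" in quotes, then 11 spaces, then b , c.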
Indeed, the parser has a bug with strings.
Currently it happens on both Windows and Linux, so I changed the platform.
BTW: There is ongoing work on the gdb-mi plugin, but that may have the same issue (if you have time, you can give it a try).
http://forums.codeblocks.org/index.php/topic,16230.msg109713.html#msg109713
Thanks for your detailed report.
Hi ohir, today I had a chance to test the gdb-mi plugin for C::B, and I can't find the bug you reported in the new plugin. If you are brave enough, you can use this gdb-mi plugin (build it from source). Thanks.