Even though #211 fixes #183 on a high level, the strange small offsets remain for unrelated reasons.
The PR uses `location` non-terminals in the grammar to capture locations at intermediate places using the following code:

    (* More parsing support functions: line, file, char count, char count for line start *)
    let getPosition () : int * string * int * int =
      let i = !current in
      i.linenum, i.fileName, Lexing.lexeme_start i.lexbuf, i.linestart

I suspect the use of `Lexing.lexeme_start` is wrong here.
For example, in a production `SEMICOLON location`, the semantic action of `location` (which itself matches no lexer tokens) returns the starting position of the most recently lexed token, i.e. the starting position of `SEMICOLON`, even though we want the location after it.
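To make the offset concrete, here is a minimal sketch that imitates the situation without a real lexer. `lex_token` is a hypothetical stand-in for an ocamllex-generated lexer (it only sets the two position fields that `Lexing.lexeme_start` and `Lexing.lexeme_end` read), and the input `"x; y"` is made up for illustration. It is not the CIL code.

```ocaml
(* Hypothetical stand-in for a generated lexer: it only records the start and
   end offsets of the token that was "lexed" most recently. *)
let lex_token (lexbuf : Lexing.lexbuf) ~(start : int) ~(stop : int) : unit =
  let p = lexbuf.Lexing.lex_curr_p in
  lexbuf.Lexing.lex_start_p <- { p with Lexing.pos_cnum = start };
  lexbuf.Lexing.lex_curr_p <- { p with Lexing.pos_cnum = stop }

(* Input "x; y": the semicolon occupies offsets 1 to 2. *)
let lexbuf = Lexing.from_string "x; y"

(* Pretend the lexer has just returned SEMICOLON. *)
let () = lex_token lexbuf ~start:1 ~stop:2

(* What an empty rule such as location sees if it queries the lexbuf now. *)
let start_seen = Lexing.lexeme_start lexbuf  (* start of SEMICOLON *)
let end_seen = Lexing.lexeme_end lexbuf      (* position just after it *)

let () =
  Printf.printf "lexeme_start = %d, lexeme_end = %d\n" start_seen end_seen
```

In this sketch the position reported by `lexeme_start` is one character too early for a location that is meant to sit after the semicolon, which matches the kind of small offset described above.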
I suspect this isn't straightforward to fix by just using the token end location instead, because at other times `location` is used before something else, so things would go wrong there instead.
Calling `Lexing` functions in the parser is probably wrong anyway. A proper solution might be to use Menhir, which provides much more powerful position facilities in the parser (avoiding the need for these `location` rules). That's what Frama-C seems to have done with their CIL as well.
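Menhir exposes per-symbol positions in semantic actions through keywords such as `$startpos(x)` and `$endpos(x)`. The idea behind that can be modelled in plain OCaml, as a rough sketch: every token carries its own start and end position, so "after `SEMICOLON`" and "before the next token" are read off the symbols themselves rather than off the global lexer state. The types and names below (`located`, `loc_after`, `loc_before`, the file name `example.c`) are hypothetical and are neither Menhir's nor CIL's API.

```ocaml
(* Hypothetical token representation: each token knows where it starts and ends. *)
type token_kind = SEMICOLON | IDENT of string

type located = {
  kind : token_kind;
  startpos : Lexing.position;
  endpos : Lexing.position;
}

(* Build a position on line 1 of a made-up file at the given character offset. *)
let pos (cnum : int) : Lexing.position =
  { Lexing.pos_fname = "example.c"; pos_lnum = 1; pos_bol = 0; pos_cnum = cnum }

(* Tokens of the made-up input "x; y". *)
let x = { kind = IDENT "x"; startpos = pos 0; endpos = pos 1 }
let semi = { kind = SEMICOLON; startpos = pos 1; endpos = pos 2 }
let y = { kind = IDENT "y"; startpos = pos 3; endpos = pos 4 }

(* A location after a symbol is that symbol's end position;
   a location before a symbol is that symbol's start position. *)
let loc_after (t : located) : int = t.endpos.Lexing.pos_cnum
let loc_before (t : located) : int = t.startpos.Lexing.pos_cnum

let () =
  Printf.printf "after SEMICOLON: %d, before y: %d\n" (loc_after semi) (loc_before y)
```

Because each position belongs to a specific symbol, both use cases (a location after something and a location before something) can be served without deciding globally between the start and the end of the last lexed token.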
Code quoted above: cil/src/ocamlutil/errormsg.ml, lines 335 to 338 in 7797529.