Unexpected token #41
Comments
I've seen this before too. The tokenizer in the assembler breaks up mov.b into three tokens: mov, ., and b. Before the assembler even sees the "b", it is looked up in the symbol table and the macro table. Since it's in the symbol table as a 0, it gets replaced with 0. I've thought before about having it tokenize as a single mov.b and breaking it up later in the assembler, but it would be quite a huge change since lots of the other assemblers and directives expect the dot to be a separate token. I'm not sure what's the best thing to do.
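To make the behaviour described above concrete, here is a minimal sketch in Python. It is not naken_asm's actual code (which is C); `SYMBOLS`, `tokenize`, and `substitute_symbols` are hypothetical names, and the only assumption is that a symbol `b` with value 0 has been defined, as in the report.

```python
import re

# Hypothetical symbol table: the source defined a symbol named "b" as 0.
SYMBOLS = {"b": 0}

def tokenize(line):
    """Split a line into words and single punctuation characters."""
    return re.findall(r"\w+|[^\s\w]", line)

def substitute_symbols(tokens, symbols):
    """Replace every token that names a symbol with its value,
    before any assembler module gets to look at it."""
    return [str(symbols[t]) if t in symbols else t for t in tokens]

raw = tokenize("mov.b #5, r4")
seen_by_assembler = substitute_symbols(raw, SYMBOLS)

print(raw)                # the dot and the size suffix are separate tokens
print(seen_by_assembler)  # the "b" suffix has already become "0"
```

In this sketch the MSP430 module would receive `mov`, `.`, `0` and has no way to tell that the third token was originally a size suffix.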
> I've seen this before too. The tokenizer in the assembler breaks up mov.b into three tokens: mov, ., and b. Before the assembler even sees the "b", it is looked up in the symbol table and the macro table. Since it's in the symbol table as a 0, it gets replaced with 0.
But then this line should be a syntax error, shouldn't it?
* `mov` is the opcode
* `b` is the source
* `.` between opcode and source => error?
> I've thought before about having it tokenize as a single mov.b and breaking it up later in the assembler, but it would be quite a huge change since lots of the other assemblers and directives expect the dot to be a separate token.
> I'm not sure what's the best thing to do.
I have no experience with handwritten lexers, but wouldn't it be possible to keep the tokens in a queue/stack per line, so that when the (MSP430-specific?) parser sees *opcode*, '.', 'b' it could reassemble them into a byte instruction?
Note: nevertheless I prefer naken_asm to CCS/IAR :)
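The queue idea could be sketched roughly as follows. This is only an illustration in Python, not how naken_asm is structured: `Token`, `make_tokens`, and `read_opcode` are hypothetical names, and the sketch assumes each token keeps its original text next to the substituted value so the parser can still recognise a size suffix.

```python
from collections import deque
from typing import NamedTuple

class Token(NamedTuple):
    raw: str    # text exactly as written in the source
    value: str  # text after symbol substitution

# Suffixes the MSP430 module expects after the dot, per the discussion above.
SIZE_SUFFIXES = {"b", "w", "a"}

def make_tokens(words, symbols):
    """Build a per-line queue that remembers both raw and substituted text."""
    return deque(Token(w, str(symbols.get(w, w))) for w in words)

def read_opcode(queue):
    """Consume the opcode and an optional '.size' suffix.
    The suffix is matched on raw text, so a symbol named 'b' cannot hide it."""
    opcode = queue.popleft().raw
    size = None
    if len(queue) >= 2 and queue[0].raw == "." and queue[1].raw in SIZE_SUFFIXES:
        queue.popleft()             # drop the dot
        size = queue.popleft().raw  # keep the suffix as written
    return opcode, size

line = make_tokens(["mov", ".", "b", "#", "5", ",", "r4"], {"b": 0})
print(read_opcode(line))
```

Operands are still read through `value`, so a symbol `b` used as an operand would be substituted as before.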
The assembler never sees the b... in your case it sees mov.0, because mov.b in your source code is separated into 3 tokens where the third (the b) is replaced by 0. The assembler was expecting to see a b, w, or a if it got a . after the instruction name.

naken_asm is pretty modular: the main code just breaks up the source code into tokens and streams them to whatever assembler module is currently selected (with .msp430, they get sent to the MSP430 assembler; if halfway through the source code there is a .arm, then the tokens get streamed to the ARM assembler). You've actually stumbled on something kind of... not pretty. You could technically write source code that looks like: mov . b and naken_asm will treat it like mov.b.

My guess is it would take a Saturday to change the way it works (since it affects multiple assemblers), so that the tokenizer sends mov.b as a single token to the assemblers. It never bothered me enough to change it, but if it bothers you enough I'll take care of it. It's important to me to make the assembler work well for other people... you might not be the only one who doesn't like the way it works. Glad you like the assembler, btw. I like it better than CCS/IAR also :)
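For comparison, a rough Python sketch of the alternative mentioned above, where the tokenizer hands mov.b to the assembler module as one token. The regular expression and the names `tokenize_joined` and `split_mnemonic` are assumptions made for illustration; the real change would live in naken_asm's C tokenizer and would need to account for the other assemblers and directives.

```python
import re

# A name immediately followed by ".suffix" is kept as one token;
# everything else falls back to words and single punctuation characters.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z]\w*)?|\w+|[^\s\w]")

def tokenize_joined(line):
    """Tokenize so that 'mov.b' survives as a single token."""
    return TOKEN_RE.findall(line)

def split_mnemonic(token):
    """Done later, inside the assembler module: separate name and size."""
    name, _, size = token.partition(".")
    return name, size or None

print(tokenize_joined("mov.b #5, r4"))
print(tokenize_joined("mov . b"))
print(split_mnemonic("mov.b"))
```

Because the suffix must follow the dot without whitespace, `mov . b` is no longer read as `mov.b` in this sketch, and a leading-dot directive such as `.msp430` still arrives as a separate dot followed by a word.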
Neat.
That's for college. Probably doesn't justify the amount of work involved.
👍
Well, it might justify it if it bothers multiple people... I want the assembler to be perfect, and I'm not sure that what I did is the best approach. I'm tempted just to get it done.
I guess if every developer had your mindset, quite a few things would be different.
Hi Mike,
this input (MSP430):
gives this error (version 9c5c74c):
It may be a parser error because
b
toc
mov.b
bymov
makes the assembler happy.