Added Motorola 68k assembly lexer (Devpac syntax) #909

nguillaumin · 2018-05-11T05:21:17Z

Added new lexer for Motorola 68000 assembly, for the Devpac syntax.
There are numerous assemblers for the 68000 with slightly different
syntax. Devpac is the only one I know, but it could be used as a starting
point to implement the other variants.

It doesn't cover the "older" Devpac syntax where comments can be put
after the last operand, at the end of the line, without a separator (;
or *). This would require the lexer to know the semantics of the
language, and that was a bit too much for my first lexer 😉
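
As an illustration of the separator problem, here is a minimal Ruby sketch (not code from this PR). `split_comment` is a hypothetical helper that assumes a `*` in the first column starts a whole-line comment and that the first `;` starts a trailing comment; it ignores string literals for brevity. A separator-less trailing comment comes back as part of the code, which is why handling it would need knowledge of each instruction's operands.

```ruby
# Hypothetical helper, not part of the PR: split one source line into
# [code, comment]. Assumes '*' in column one or the first ';' starts a comment.
def split_comment(line)
  # A '*' in the first column marks the whole line as a comment.
  return ["", line.rstrip] if line.start_with?("*")

  idx = line.index(";")
  # No separator found: everything is treated as code, even if the tail
  # is really an old-style comment.
  return [line.rstrip, nil] if idx.nil?

  [line[0...idx].rstrip, line[idx..-1].rstrip]
end

p split_comment("  move.w d0,d1 ; copy d0 to d1")
p split_comment("  rts   return to caller")
```

In the second call the words after `rts` cannot be told apart from operands without knowing that `rts` takes none.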

I started it from the NASM lexer, but it's a bit simpler.

I initially associated it with *.asm files but that conflicts with
NASM. I also associated it with the text/x-asm mime type, but I
removed it as I'm not sure if that makes sense (why this one, and not
NASM?)

As a result there are no tests, since there's nothing to test in terms
of file/mime association.

nguillaumin · 2018-05-11T05:21:53Z

This is my first attempt and I'm not especially familiar with Ruby, so let me know if there's anything I can improve, thanks!

dblessing · 2018-06-13T20:28:10Z

Thanks for the PR @nguillaumin.

How is/isn't the Motorola 68000 assembler related to NASM? As this is currently largely a copy/paste of NASM I think we should consider how they're related and see if subclasses/delegating would be appropriate here.

Note from the README:

Please don't submit lexers that are largely copy-pasted from other files.

Happy to help you through this if you can help me understand the relationship/differences between the two.

nguillaumin · 2018-06-14T05:12:14Z

Hi,

NASM is an assembler for the x86 platform (i.e. PCs). This syntax is for assembly on Motorola 68000 processors, which are a completely different architecture. They were used, for example, in older computers like the Atari ST and Amiga.

As such, there's not much that can be shared between the two, as they have completely different instructions and syntax.

The only thing that's common is that it's assembler so it's mostly single-line instruction oriented and lacks control structures or higher level constructs, but I don't think it would make sense to try to refactor that. It would be similar to trying to use a common base for C and Java just because they both use braces and have if and for statements, for example.

I don't believe it's largely a copy/paste of NASM; if you diff the two you'll see they differ quite significantly. Apologies if my comment was misleading, I just wanted to mention I used the NASM one as a starting point. I followed the guide, which suggested starting from an existing one (which is good advice IMHO).

I hope this helps?

dblessing

A few minor comments. Thanks for your contribution @nguillaumin and for explaining. I commented on why I thought it was a copy/paste.

dblessing · 2018-06-13T20:15:42Z

lib/rouge/lexers/m68k_devpac.rb

+      def self.keywords_type
+        @keywords_type ||= Set.new %w(
+            dc ds dcb
+        )


Indent the keywords two spaces.

@keywords_type ||= Set.new %w(
  dc ds dcb
)

Do this throughout, please.
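
For readers unfamiliar with the idiom under review, here is a minimal standalone sketch of a memoized keyword set. `KeywordTable` and `keyword_type?` are hypothetical names used only for illustration; they are not part of Rouge or of this PR.

```ruby
require 'set'

# Hypothetical stand-in for the lexer class, for illustration only.
class KeywordTable
  # Build the set once and reuse it on later calls (memoization via ||=).
  def self.keywords_type
    @keywords_type ||= Set.new %w(
      dc ds dcb
    )
  end

  # Case-insensitive membership check against the memoized set.
  def self.keyword_type?(word)
    keywords_type.include?(word.downcase)
  end
end

p KeywordTable.keyword_type?("DC")    # => true
p KeywordTable.keyword_type?("move")  # => false
```

Using a `Set` keeps the lookup cheap however many words are listed, and the `%w( ... )` literal lets the list be wrapped and indented freely.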

dblessing · 2018-06-13T20:21:10Z

lib/rouge/lexers/m68k_devpac.rb

+      end
+
+      state :inline_whitespace do
+        rule /[ \t\r]+/, Text


Is the space before the \t there to match a space character? If so, use \s.

I think the semantics differ, because \s matches more than just a space (it matches [ \t\r\n\f]), but to be honest I don't really understand this part, and using \s works, so that's fine by me.
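
The difference discussed here can be checked in plain Ruby, without Rouge. This is a small sketch with an invented sample string; the constant names are assumptions for illustration. The point is only that `\s` also consumes the newline, which an "inline" whitespace rule might be expected to leave alone.

```ruby
INLINE_WS = /\A[ \t\r]+/  # character class from the PR's :inline_whitespace rule
ANY_WS    = /\A\s+/       # the suggested shorthand

sample = " \t\n  rts"

inline_match = sample[INLINE_WS]  # stops before the newline
any_match    = sample[ANY_WS]     # runs through the newline and the next indent

p inline_match  # => " \t"
p any_match     # => " \t\n  "
```

Whether that matters depends on whether another rule in the lexer needs to see the newline itself.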

dblessing · 2018-06-13T20:22:56Z

lib/rouge/lexers/m68k_devpac.rb

+      state :expr_whitespace do
+        rule /\n+/m, Text, :expr_bol
+        mixin :whitespace
+      end


What added value does expr_whitespace have? The first rule is identical to whitespace and then it mixes in whitespace.

This is the rule that made me think it was a direct copy/paste of nasm. It has this state and I'm not sure why it's needed. It seems to be a paradigm that has traveled through about 4 different lexers. Unless we know why it's needed, let's leave it out.

That's fair, this is indeed coming from NASM and I can't say I understand it fully. I removed it.

dblessing · 2018-06-20T03:05:44Z

Yes, you’re right. I’m ok with either way you decide to go.

On Tue, Jun 19, 2018 at 9:40 PM Nicolas Guillaumin wrote:

> In lib/rouge/lexers/m68k_devpac.rb:
>
> +        @builtins ||=Set.new %w(
> +          d0 d1 d2 d3 d4 d5 d6 d7
> +          a0 a1 a2 a3 a4 a5 a6 a7 a7'
> +          pc usp ssp ccr
> +        )
> +      end
> +
> +      start { push :expr_bol }
> +
> +      state :expr_bol do
> +        mixin :inline_whitespace
> +        rule(//) { pop! }
> +      end
> +
> +      state :inline_whitespace do
> +        rule /[ \t\r]+/, Text
>
> I think the semantics differ because \s will match more than just a space (It will match [ \t\r\n\f]) but to be honest I don't really understand this part and using \s works, so fine by me.

hanklank · 2018-07-03T15:34:59Z

First, thanks for the work of implementing m68k syntax; I was just about to read up on how to do it and was glad to see this. However, I think the lexer should just be called m68k, even if it is initially only tested with one 68k assembler (Devpac). Why? Because the other 68k assemblers have almost the same syntax in most regards, and it wouldn't make sense for me to then add, for example, an m68k-asmone and an m68k-vasm. Let's try to have all the m68k support in the same lexer. So I suggest dropping the devpac suffix. If I get the time, I'll look into adding full support for code written for some of the others.

nguillaumin · 2018-07-03T20:44:30Z

I'm only familiar with the Devpac one and I assumed other syntaxes would be different enough to warrant a separate lexer. If we can support all syntaxes in a single lexer though, that's great! 👍

hanklank · 2018-07-05T09:09:40Z

I will test your highlighting on a few codebases I have for other 68k assemblers, see what I can find, and try to suggest additions or minor changes. I noticed Vim and Atom have m68k highlighting, so one could also peek there for inspiration. I have never done Ruby before, but I think I understand enough of the lexers now to be able to contribute. Also, we should think about how to identify a file. Maybe we could make the guessing better: first give it the common extensions (s, i, asm, 68k, more?), and then guess by instructions in the code such as move (NASM doesn't have this, I think), etc.

Rouge only supports UTF-8, though, and most m68k codebases are not UTF-8 (unless they were converted on GitHub/GitLab etc.). I guess UTF-8 only is a design decision and will never change? Asking @jneen

nguillaumin · 2018-07-06T02:16:50Z

I'm keen to get something merged soon-ish, as I need the m68k syntax for another project. So even if we intend to make improvements later, I'll probably just remove the devpac suffix on this branch so that it can be merged as-is, as an initial version, if everyone is OK with that?

nguillaumin · 2018-07-10T04:56:46Z

So I've pushed a commit that does that, removing the Devpac suffix. I think it's good to go (subsequent improvements can be made in a separate PR). The PR still says "Changes requested" but I think I have addressed these? Let me know if that isn't the case, thanks!

hanklank · 2018-07-11T08:25:15Z

Sounds good. I don't know if/when I'll get the time to contribute more to it, so just merge; I'm happy with the devpac suffix removed.

hanklank · 2018-07-13T09:35:48Z

@nguillaumin FYI, I've started working on improvements and am almost done, after testing on various source code and looking at the 68k spec. It covers more cases. The cases where free-style comments can follow any opcode or operand forced me to refactor your contribution a bit and make it more of a flow with a few states. Also, I added a test with s, i, m68k that also looks for move.(bsw) and register d0 in the source, which is very characteristic of 68k and should be good enough to identify a 68k file. I still suggest, however, that this PR is merged first; I'll then add my contribution in a separate PR later, so as not to confuse or delay this PR even longer. @dblessing
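
A rough sketch of the kind of guess described in this comment, written as plain Ruby rather than against Rouge's API. `looks_like_m68k?` and `M68K_EXTENSIONS` are hypothetical names; the extension list and the `move.[bsw]` pattern mirror the comment literally, and requiring both the instruction and the register is an assumption, since the comment does not say how the checks are combined.

```ruby
# Hypothetical heuristic, not the follow-up PR's actual code.
M68K_EXTENSIONS = %w(.s .i .m68k).freeze

def looks_like_m68k?(filename, source)
  # Step 1: only consider files with one of the listed extensions.
  return false unless M68K_EXTENSIONS.include?(File.extname(filename).downcase)

  # Step 2: look for a sized move instruction and the d0 register.
  has_move = source.match?(/\bmove\.[bsw]\b/i)
  has_d0   = source.match?(/\bd0\b/i)
  has_move && has_d0
end

p looks_like_m68k?("demo.s", "  move.w d0,d1\n  rts\n")  # => true
p looks_like_m68k?("demo.s", "  mov eax, 1\n")           # => false
```

The extension check alone cannot separate dialects that share `.s`, which is why a content check is layered on top.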

nguillaumin · 2018-07-17T02:22:59Z

The build is failing, but I don't think that's related to my changes from what I can tell...

nguillaumin · 2018-08-09T00:39:15Z

Is there anything I can do to help progress this? Thanks!

nguillaumin · 2018-08-10T23:58:33Z

Excellent, thanks!

dblessing added the author-action label (The PR has been reviewed but action by the author is needed) Jun 13, 2018

dblessing requested changes Jun 19, 2018

Nicolas Guillaumin added 2 commits June 19, 2018 19:19

Fix indentation

4bd419b

Address code review comments

e9b3be8

Remove Devpac suffix as per pull requests discussions

20755ef

dblessing approved these changes Aug 10, 2018

dblessing merged commit 3d2dc84 into rouge-ruby:master Aug 10, 2018

pyrmont removed the author-action label (The PR has been reviewed but action by the author is needed) Jun 5, 2019
