New feature: Instruction #767

Closed
mr-tz opened this issue Sep 9, 2021 · 6 comments · Fixed by #930
Labels
enhancement New feature or request
Comments

@mr-tz
Collaborator

mr-tz commented Sep 9, 2021

Summary

Often we want to check for specific instructions and use the basic block scope for this.

Example from the rule "check for software breakpoints":

  features:
    - and:
      - basic block:
        - and:
          - mnemonic: cmp
          - or:
            - number: 0xCC
            - and:
              - number: 0xCD
              - number: 0x3
      - match: contain loop

I'm not sure if we need complete flexibility, but a way to provide variable operand values would be neat.
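To illustrate why the basic block scope is only an approximation for "this exact instruction", here is a minimal Python sketch. It assumes a simplified, hypothetical feature model (the `Insn` record and both helper functions are made up for illustration and are not capa's real data structures). In basic block scope the features of all instructions are pooled, so `mnemonic: cmp` and `number: 0xCC` can be satisfied by two different instructions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Hypothetical per-instruction feature record (not capa's data model)."""
    mnemonic: str
    numbers: frozenset = frozenset()


def matches_in_basic_block(block, mnemonic, number):
    """Basic block scope: features of all instructions are pooled first."""
    mnemonics = {insn.mnemonic for insn in block}
    numbers = set().union(*(insn.numbers for insn in block))
    return mnemonic in mnemonics and number in numbers


def matches_in_instruction(block, mnemonic, number):
    """Instruction scope: both features must come from the same instruction."""
    return any(
        insn.mnemonic == mnemonic and number in insn.numbers for insn in block
    )


# mov al, 0xCC ; cmp ecx, 0x10 -> no instruction compares against 0xCC
block = [
    Insn("mov", frozenset({0xCC})),
    Insn("cmp", frozenset({0x10})),
]

print(matches_in_basic_block(block, "cmp", 0xCC))
print(matches_in_instruction(block, "cmp", 0xCC))
```

Under these assumptions the pooled check accepts the block even though no single instruction compares against 0xCC, while the per-instruction check rejects it.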

Examples:

- instruction: cmp <reg>, 0xCC
- instruction: cmp <mem>, 0xCC
- instruction: cmp <var = anything>, 0xCC
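The pattern strings above could be interpreted roughly as in the following Python sketch. Everything here is an assumption made for illustration: the `Operand` and `Insn` records, the `parse_pattern` and `matches` helpers, and the reading of `<reg>`, `<mem>`, and `<var = anything>` as operand-kind placeholders. Only numeric literals are supported as concrete operands.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    kind: str             # "reg", "mem", or "imm" (hypothetical categories)
    value: object = None  # register name or immediate value


@dataclass(frozen=True)
class Insn:
    mnemonic: str
    operands: tuple


def parse_pattern(pattern):
    """Split e.g. 'cmp <reg>, 0xCC' into a mnemonic and operand matchers."""
    mnemonic, _, rest = pattern.strip().partition(" ")
    matchers = []
    for text in (part.strip() for part in rest.split(",") if part.strip()):
        placeholder = re.fullmatch(r"<(\w+)(?:\s*=\s*\w+)?>", text)
        if placeholder:
            kind = placeholder.group(1)
            # "<var = anything>" is treated as a wildcard for any operand kind
            matchers.append(("kind", None if kind == "var" else kind))
        else:
            # anything else is assumed to be a numeric literal such as 0xCC
            matchers.append(("imm", int(text, 0)))
    return mnemonic, matchers


def matches(insn, pattern):
    """Return True if the instruction satisfies the pattern string."""
    mnemonic, matchers = parse_pattern(pattern)
    if insn.mnemonic != mnemonic or len(insn.operands) != len(matchers):
        return False
    for operand, (tag, expected) in zip(insn.operands, matchers):
        if tag == "kind":
            if expected is not None and operand.kind != expected:
                return False
        elif operand.kind != "imm" or operand.value != expected:
            return False
    return True


insn = Insn("cmp", (Operand("reg", "al"), Operand("imm", 0xCC)))
print(matches(insn, "cmp <reg>, 0xCC"))
print(matches(insn, "cmp <mem>, 0xCC"))
print(matches(insn, "cmp <var = anything>, 0xCC"))
```

A string-based syntax like this is compact, but it needs its own small parser, which is one argument for the structured alternative discussed next.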

Key instruction fields:

  • mnemonic
  • operands
    • reg/imm/mem
    • size
    • displacement
  • (prefixes) - optional

Or as a "subscope"?

- instruction:
  - mnemonic: cmp
  - operand1: 0xCC

We could then continue to support the mnemonic feature as an alias for the new instruction feature.
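As a rough illustration of the alias idea, a rule loader could rewrite a standalone `mnemonic` feature into the new subscope before matching. The sketch below assumes features arrive as plain dictionaries parsed from YAML; `normalize_feature` is a hypothetical helper, not an existing capa function.

```python
def normalize_feature(feature):
    """Rewrite a legacy {'mnemonic': X} feature as an instruction subscope.

    Any other feature is returned unchanged.
    """
    if set(feature) == {"mnemonic"}:
        return {"instruction": [{"mnemonic": feature["mnemonic"]}]}
    return feature


print(normalize_feature({"mnemonic": "cmp"}))
print(normalize_feature({"number": 0xCC}))
```

With such a step, existing rules that use `mnemonic` would keep working without being edited.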

Motivation

An instruction scope would allow for more expressive and concise rules.

Downsides

More complexity for extractors and potential performance hits.

mr-tz added the enhancement label Sep 9, 2021
@mike-hunhoff
Collaborator

I would love to see this implemented. I'd prefer the subscope route, as it would allow users to easily specify environment specifics like the architecture.

- instruction:
  - arch: i386
  - mnemonic: cmp
  - operand1: eax
  - operand2: ecx
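A subscope like the one above could be evaluated roughly as follows. This is a sketch under stated assumptions: the subscope is taken to be the parsed YAML list of single-key mappings, an instruction is represented as a plain dictionary with `mnemonic` and `operands`, and `match_instruction` is a hypothetical helper rather than part of capa.

```python
def match_instruction(spec, insn, arch):
    """Check one instruction against an `instruction` subscope.

    spec: list of single-key dicts, e.g. [{"mnemonic": "cmp"}, {"operand1": "eax"}]
    insn: dict with "mnemonic" (str) and "operands" (list), a made-up model
    arch: architecture name of the sample being analyzed
    """
    for entry in spec:
        for key, expected in entry.items():
            if key == "arch":
                if arch != expected:
                    return False
            elif key == "mnemonic":
                if insn["mnemonic"] != expected:
                    return False
            elif key.startswith("operand") and key[len("operand"):].isdigit():
                # operandN is 1-based in the proposal, lists are 0-based
                index = int(key[len("operand"):]) - 1
                operands = insn["operands"]
                if not 0 <= index < len(operands) or operands[index] != expected:
                    return False
            else:
                raise ValueError(f"unsupported instruction field: {key}")
    return True


spec = [
    {"arch": "i386"},
    {"mnemonic": "cmp"},
    {"operand1": "eax"},
    {"operand2": "ecx"},
]
insn = {"mnemonic": "cmp", "operands": ["eax", "ecx"]}

print(match_instruction(spec, insn, arch="i386"))
print(match_instruction(spec, insn, arch="amd64"))
```

Because every field is checked against the same instruction, operand positions are explicit, and an `arch` entry simply gates the whole subscope.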

@williballenthin
Collaborator

This does sound useful.

I think this will negatively affect performance; however, this is not a good reason to reduce the expressivity of capa. It just lends further support to #602 to investigate better matching algorithms.

@williballenthin
Collaborator

the grap tool has a mini-language for declaring patterns of instructions like this. if it makes sense we could try to reuse terms from that project. note, i'm not aware of a large corpus of grap rules, so there's probably not a huge overlap/reuse argument to be made here.

https://github.com/QuoSecGmbH/grap/

@williballenthin
Collaborator

can we build out some examples of rules we'd want this feature/scope for? let's list them in this thread.

why? i'm having a little trouble thinking about which features we need/don't need. e.g., do we need to name the registers? can we get by with the existing features (mnemonic, number, offset, arch)?

@mr-tz
Collaborator Author

mr-tz commented Sep 10, 2021

Just some examples from existing rules where this would be helpful.

  - basic block:
    - and:
      - mnemonic: and
      - number: 0x2 = KdDebuggerNotPresent
  - basic block:
    - and:
      - mnemonic: and
      - number: 0x1 = KdDebuggerEnabled

  - and:
    - mnemonic: add
    - or:
      - number/x32: 0x68 = PEB.NtGlobalFlag
      - number/x64: 0xBC = PEB.NtGlobalFlag

  - basic block:
    - and:
      - mnemonic: cmp
      - number: 0x5E = '^' (Track 1 separator)
  - basic block:
    - and:
      - mnemonic: cmp
      - number: 0x3D = '=' (Track 2 separator)
  - basic block:
    - and:
      - mnemonic: cmp
      - number: 0x25 = '%' (Track 1 start sentinel)
  - basic block:
    - and:
      - mnemonic: cmp
      - number: 0x42 = 'B' (Format code)
  - basic block:
    - and:
      - mnemonic: cmp
      - number: 0x44 = 'D' (Format code)
  - basic block:
    - and:
      - mnemonic: cmp
      - number: 0x3F = '?' (Track 1 & 2 end sentinel)
  - basic block:
    - and:
      - mnemonic: cmp
      - number: 0x3B = ';' (Track 2 start sentinel)

  - basic block:
    - and:
      - or:
        - mnemonic: cmp
        - mnemonic: test
      - or:
        - number: 200 = OK
        - number: 400 = Bad Request
        - number: 401 = Unauthorized
        - number: 403 = Forbidden
        - number: 404 = Not Found
        - and:
          - number: 0xFFFF000F = -65521
          - mnemonic: add
        - and:
          - number: 0xFFF1 = 65521
          - mnemonic: sub
  - basic block:
    - and:
      - mnemonic: cmp
      - number: 32000
  - basic block:
    - and:
      - mnemonic: cmp
      - or:
        - number: 127
        - number: 128

We should also think of additional useful features (or feature combinations) that we cannot currently express with our rules.

Initially, I like the idea of using the existing features.

@williballenthin
Collaborator

piggy back on syntax discussion in #921
