Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve codegen for C style enums #1096

Open
jrmuizel opened this issue Oct 24, 2017 · 7 comments
Open

Improve codegen for C style enums #1096

jrmuizel opened this issue Oct 24, 2017 · 7 comments

Comments

@jrmuizel
Copy link
Contributor

Because C style enums are loosely typed and aren't typed they tend to be expressed in C more loosely than they are actually used.

For example, one might have an enum like:
https://github.com/zyantific/zydis/blob/4dd632463c122ee85ba9c5f0c1e56ac748bf246e/include/Zydis/DecoderTypes.h#L478

but that enum is stored in a ZydisCPUFlagAction/uint8_t (https://github.com/zyantific/zydis/blob/4dd632463c122ee85ba9c5f0c1e56ac748bf246e/include/Zydis/DecoderTypes.h#L473, https://github.com/zyantific/zydis/blob/4dd632463c122ee85ba9c5f0c1e56ac748bf246e/include/Zydis/DecoderTypes.h#L816).

It would be nice to be able map the ZydisCPUFlagAction type to the ZydisCPUFlagActions enum somehow.

@fitzgen
Copy link
Member

fitzgen commented Oct 24, 2017

Can you be more specific, and give an example definittion, its current translation, and what you would prefer instead? Which style of enum translation are you using? Default (into constants)? --constified-enum-modules? --rustified-enum-modules?

@jrmuizel
Copy link
Contributor Author

From above: https://github.com/zyantific/zydis/blob/4dd632463c122ee85ba9c5f0c1e56ac748bf246e/include/Zydis/DecoderTypes.h#L478

So currently constified enums are being used so we get something like: (notice the difference between the field type ZydisCPUFlagAction and the enum type ZydisCPUFlagActions).

/// @brief   Defines the @c ZydisCPUFlagAction datatype.
pub type ZydisCPUFlagAction = u8;

pub const ZYDIS_CPUFLAG_ACTION_NONE: ZydisCPUFlagActions = 0;
pub const ZYDIS_CPUFLAG_ACTION_TESTED: ZydisCPUFlagActions = 1;
pub const ZYDIS_CPUFLAG_ACTION_MODIFIED: ZydisCPUFlagActions = 2;
pub const ZYDIS_CPUFLAG_ACTION_SET_0: ZydisCPUFlagActions = 3;
pub const ZYDIS_CPUFLAG_ACTION_SET_1: ZydisCPUFlagActions = 4;
pub const ZYDIS_CPUFLAG_ACTION_UNDEFINED: ZydisCPUFlagActions = 5;
pub const ZYDIS_CPUFLAG_ACTION_MAX_VALUE: ZydisCPUFlagActions = 5;
/// @brief   Values that represent CPU-flag actions.
pub type ZydisCPUFlagActions = ::std::os::raw::c_uint;

I'd like to be able to turn this into something like:

#[repr(u8)]
enum ZydisCPUFlagAction {
  NONE = 0,
  TESTED = 1,
  MODIFIED = 2,
  SET_0 = 3,
  SET_1 = 4,
  UNDEFINED = 5,
  MAX_VALUE = 5,
}

/// @brief   Values that represent CPU-flag actions.
pub type ZydisCPUFlagActions = ::std::os::raw::c_uint;

@fitzgen
Copy link
Member

fitzgen commented Oct 24, 2017

Using --rustified-enum '.*', this is outputted:

/* automatically generated by rust-bindgen */

pub const ZydisCPUFlagActions_ZYDIS_CPUFLAG_ACTION_MAX_VALUE : ZydisCPUFlagActions = ZydisCPUFlagActions :: ZYDIS_CPUFLAG_ACTION_UNDEFINED ;
#[repr(u32)]
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
pub enum ZydisCPUFlagActions {
    ZYDIS_CPUFLAG_ACTION_NONE = 0,
    ZYDIS_CPUFLAG_ACTION_TESTED = 1,
    ZYDIS_CPUFLAG_ACTION_MODIFIED = 2,
    ZYDIS_CPUFLAG_ACTION_SET_0 = 3,
    ZYDIS_CPUFLAG_ACTION_SET_1 = 4,
    ZYDIS_CPUFLAG_ACTION_UNDEFINED = 5,
}

You have to be careful, though -- and this is why rustified-enum isn't the default -- because rustc relies on the assumption that there will never be a ZydisCPUFlagsActions whose value is 6, and breaking that assumption is UB.

@jrmuizel
Copy link
Contributor Author

I need ZydisCPUFlagAction to be an enum not ZydisCPUFlagActions and I need it to be repr(u8)

@lilianmoraru
Copy link

Shouldn't the enums be "i32" by default?(These are "int" by default in C/C++), unless the code specifically wants another type?
Seems weird that bindgen decides to use u32 when the value is not negative, while the C code uses i32(in C you could overflow, maybe intentionally and in Rust it will just fit).

@emilio
Copy link
Contributor

emilio commented Feb 15, 2018

Bindgen uses whatever enum type comes from clang, so if it's a u32, clang is picking a u32 type for it.

@emilio
Copy link
Contributor

emilio commented Feb 15, 2018

Also, signed integer overflow is auto-UB even in C, so it's not supposed to overflow :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants