Added Dockerfile #3
Using Prebuilt Image
docker run -it --rm thawsystems/retdec
Building Docker Image
git clone https://github.com/Manouchehri/retdec
docker build -t retdec .
docker run -it --rm retdec
How about removing the maintainer (since it'll be maintained by repo contributors if it ends up in the repo) and doing
Pull i386 Ubuntu image instead of x64, since only 32-bit is supported.
FROM i386/ubuntu:bionic
"Supported architectures (32b only): Intel x86, ARM, MIPS, PIC32, and PowerPC"
@cquick97 That's for target binaries, not the host platform?
Sure, but if 64-bit isn't required, why not just stick to 32-bit? I noticed that I couldn't compile a test binary within the container with I'm halfway through an i386 build right now, so we'll see if that fixes anything. Or maybe I'm just crazy/dumb and you can ignore my nonsense.
You can install a 32-bit compiler if needed.
I don't think changing the host is a good idea; going back to 32-bit Linux distributions is a nightmare. Other RE tools like IDA Pro, Binary Ninja, etc. are already 64-bit only, so I think it's an acceptable requirement.
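For anyone who hits the same compile problem, here is a minimal sketch of the "install a 32-bit compiler" route on a 64-bit Ubuntu image. It assumes the gcc-multilib package and GCC's -m32 flag are what is needed; the run, install_32bit_toolchain, and compile_32bit helpers and the DRY_RUN switch are hypothetical names used only for illustration, not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumption: gcc-multilib supplies the 32-bit libraries that gcc -m32 needs
# on a 64-bit Ubuntu image.
install_32bit_toolchain() {
    run apt-get update
    run apt-get install -y gcc-multilib
}

# Compile a C source file as a 32-bit binary.
# Usage: compile_32bit SRC OUT
compile_32bit() {
    run gcc -m32 -o "$2" "$1"
}
```

Setting DRY_RUN=1 before calling either function only prints the commands, which makes it easy to check them before running anything inside the container.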
That's probably a better idea. Disregard :)
👏 for the docker image, instead of full-dependencies installation.
@Manouchehri the main thing I wasn't keen on was cloning the whole repo graph twice - maybe I'm just making suggestions while waiting for official feedback :)
@breznak I am not sure if I understood your question correctly, but the future support for decompilation of 64b binaries (#9) will be independent of the OS that is used to build and run RetDec. That is, once there is support for decompilation of 64b files, you will be able to use both 32b/64b RetDec builds.
@s3rvac my idea was that on 64-bit you can run both 64/32-bit apps, if you were to develop or debug something in the image. But basically disregard that, as it boils down to
Also, it may be more work to set everything up (once), but then Alpine could be a lighter alternative to a full-blown Ubuntu. What do you think?
@breznak While Alpine is lighter, I prefer to use Ubuntu as it's more similar to the platform developers will be using and testing with.
@aidanhs You do not need to clone any of the submodules before building the image. See the commands listed in #3 (comment).
@Manouchehri my point is that if you initialise submodules, you can People can use your Dockerfile until they actually want to make and test a change, at which point they need to either edit it to use my proposed solution ( Part of the benefit of Dockerfiles is to ease development, which is what I'm shooting for.
The problem is My suggestion is to make a separate Dockerfile for your intended development use case. This one is meant for users to get started quickly with using RetDec.
@Manouchehri I must agree with @aidanhs here. If you clone the repo while building the image, that Dockerfile could never be used to test local changes. That means you can never test to see if the Docker image works without committing and pushing the change! I also don't think that running At my work we also use Docker. Our general flow is to have the Dockerfile build an image from the local repository. This allows us to test local changes. Then when we want to publish a new version of the image, our CI system fully checks out the master branch and builds the new image. If you are concerned users won't fully clone the repository, could we not add a README.md that explains the image creation process?
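To make the "build from the local repository" flow concrete, here is a rough sketch. It assumes a Dockerfile that takes its sources from the build context (for example via COPY) rather than cloning them, which is not what the Dockerfile in this PR does. The build_local_image and run helpers, the DRY_RUN switch, and the retdec:dev tag are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build an image from the current checkout so local changes are included.
# Usage: build_local_image [TAG]   (TAG defaults to the made-up retdec:dev)
build_local_image() {
    tag="${1:-retdec:dev}"
    # Make sure submodule sources are present in the build context.
    run git submodule update --init --recursive
    # Assumes the Dockerfile in the current directory uses the build
    # context instead of cloning the repository itself.
    run docker build -t "$tag" .
}
```

A CI job could call the same function from a fresh checkout of master, which matches the publish step described above.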
Thanks for your contribution! 😊
Some minor feedback...
Dockerfile
mkdir build && \
cd build && \
cmake .. -DCMAKE_INSTALL_PREFIX=/home/retdec/retdec-install && \
make && \
Is there some way we can auto-detect the number of processors available?
It could be done with a little bash, but we should probably try to improve the normal build process instead of diverging and hacking about in a Dockerfile.
I don't know if this should be multi-platform, but on linux you can go
make -j$(nproc)
https://unix.stackexchange.com/questions/208568/how-to-determine-the-maximum-number-to-pass-to-make-j-option
I would also suggest using make -j$(nproc), because otherwise only a single core will be utilized for the build. Alternatively, set MAKEFLAGS="-j $(nproc)".
-j (if the project is huge, it will sometimes spawn too many jobs) and -j $(nproc) are both good. You need coreutils for nproc, but I would expect it to be pulled in by something like build-essential, or to already be there.
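A small sketch of the job-count detection discussed in this thread, with a fallback for systems where nproc is missing. The make_jobs function name is hypothetical, and the getconf fallback is an addition that nobody in the thread proposed.

```shell
#!/bin/sh
# Print a job count suitable for `make -j`.
# Prefers nproc (GNU coreutils), then getconf, and falls back to 1.
make_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN
    else
        echo 1
    fi
}

# Inside the build directory this could be used either as
#   make -j"$(make_jobs)"
# or, to affect every make invocation in the same shell,
#   export MAKEFLAGS="-j$(make_jobs)"
```

Passing an explicit number keeps the job count bounded, unlike a bare -j, which places no limit on parallel jobs.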
Done!
ENV HOME /home/retdec

RUN apt-get -y update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential git bc graphviz upx cmake python zlib1g-dev flex bison libtinfo-dev autoconf pkg-config m4 libtool wget
I would also use --no-install-recommends
Thanks. I'll eventually split this into a multi-stage so the build tools won't need to be in the production image.
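For reference, a sketch of what the install step might look like with --no-install-recommends. The package list is copied from the Dockerfile under review. Adding ca-certificates is an assumption on my part: it is often only pulled in as a recommended package, and without it git clone and wget over HTTPS can fail certificate checks. The helper names and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Package list taken from the Dockerfile in this PR.
RETDEC_BUILD_DEPS="build-essential git bc graphviz upx cmake python zlib1g-dev flex bison libtinfo-dev autoconf pkg-config m4 libtool wget"
# ASSUMPTION (not in the original list): needed for HTTPS once
# recommended packages are no longer installed automatically.
EXTRA_DEPS="ca-certificates"

install_build_deps() {
    run apt-get -y update
    # The variables are intentionally unquoted so each package name
    # becomes its own argument.
    run env DEBIAN_FRONTEND=noninteractive apt-get install -y \
        --no-install-recommends $RETDEC_BUILD_DEPS $EXTRA_DEPS
}
```

Whether the build still succeeds without the recommended packages would need to be checked by actually building the image.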
DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential git bc graphviz upx cmake python zlib1g-dev flex bison libtinfo-dev autoconf pkg-config m4 libtool wget

USER retdec
RUN git clone --recursive https://github.com/avast-tl/retdec && \
I would prefer to use COPY.
Could it be optional somehow? Maybe two Dockerfiles: one for local changes and one from upstream master?
That's what I suggested.
Since it's not my use case, I'll let someone else create a "development" Dockerfile that they like.
@s3rvac Any feedback from the Avast team? =)
cd retdec && \
mkdir build && \
cd build && \
cmake .. -DCMAKE_INSTALL_PREFIX=/home/retdec/retdec-install && \
Shouldn't we install it into /home/retdec/install? The additional retdec- prefix seems redundant to me. Or is there a motivation behind the current path?
The reason I use retdec-install is that if the end user adds more things to /home/retdec/, it can be a bit ambiguous what an install/ folder is referring to.
Not opposed to doing it either way though, which do you want me to use? =)
OK - I get that having two Dockerfiles might be useful. But in that case, I'd like to suggest renaming the file in this PR to Dockerfile.master or Dockerfile.prod or something, so when I create a PR for a On that note, it'd be cool to have some instructions for using this particular Dockerfile - for example, you will (when this PR is merged) be able to just
@aidanhs Dockerfiles must be named Once a functional
@Manouchehri In that case, the path it is under must make it obvious that it is for publishing only, not development, i.e. '/productionDockerfile/Dockerfile' or the like.
Sure, but why do that if there's currently only one Dockerfile to begin with? If another Dockerfile is written, it should be added to a folder at that point. We're debating folder structure for files that don't exist.
Honestly, if this patch were complete it would have Dockerfiles for all common use cases (in our case, production and development). It's better not to have to change READMEs, build scripts, etc. later when the other common use case Dockerfile is added.
@HugoKlepsch Okay, go do that yourself then.
FWIW, this isn't true (though I think it used to be the case) - you can see that https://hub.docker.com/r/rustlang/crater/builds/bnftxfvygyjsakpah66cvyh/ uses the
If that is the case, maybe we can have a
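One way several Dockerfiles could live side by side is docker build's -f (--file) option, which selects a Dockerfile by path. Whether that is the mechanism the comments above had in mind is an assumption, and the Dockerfile.dev / Dockerfile.prod names, the retdec:VARIANT tags, and the build_variant helper are purely illustrative.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the image described by Dockerfile.VARIANT in the current directory.
# Usage: build_variant dev   or   build_variant prod
# (both file names are made up for this sketch)
build_variant() {
    variant="$1"
    run docker build -f "Dockerfile.$variant" -t "retdec:$variant" .
}
```

With this layout a plain docker build . would still expect a file named Dockerfile, so a README note on which variant to use would probably still be needed.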
* Capstone2llvmirtool default basic modes for architectures
  Run tool with reasonable Capstone basic modes for specified architecture. Default values are as follows:
  -a arm : CS_MODE_ARM
  -a arm64 : CS_MODE_ARM [looks like keystone doesn't like this]
  -a mips : CS_MODE_MIPS32
  -a x86 : CS_MODE_32
  -a ppc : CS_MODE_32
  -a <rest>: CS_MODE_LITTLE_ENDIAN
* Base for the ARM64 translator
  - register maps(_reg2type)
  - instructions map(_i2fm)
  Modified ARM Translator unit, Work in progress.
* Fix the cs_reg_name
  - register name could not be found because of the wrong cs_arch in constructor
* Add ARM64 support for capstone dependency
  - capstone was configured without the ARM64 support, this caused cs_open to fail
* Temporary solution to call translate function
* Status register and program counter added to environment
  - flags from status register added to arm64 env
  - program counter added to arm64 env
* Methods store/load registers/operands skeletons + add instruction
  - basic implementation of functions needed for loading and storing operands
  - translateAdd is for testing purposes
* Store instruction base
  - started implementation of MEM operand type
  - Store register instruction translation method, e.g. retdec-capstone2llvmir -a arm64 -t 'str x0, [x1]'
* Operand shifts ported from ARM and MOV instruction translation
  - MOV, MVN and MOVZ instructions
  - operand shift functions moved and changed for ARM64
  - instructions like 'movz x0, #3 LSL 16' work now
* Arm64 - tests ported from Arm
  - test framework capstone2llvmirtranslator
  - first INS_ADD test
  - cmake compilation
* Basic MOV tests
  - MOV, MOVZ
* Test for STR instruction and test header comments
* STP instruction + tests, pc in new enum, get op addr function
  - Store pair instruction{pre-index, post-index, signed-offset}
  - test for all cases except 32bit operands
  - pc moved to its own enum
  - generateGetOperandAddr to generate address from instruction operand
* LDR + STR, LDR tests from ARM, LDP stub
  - LDR{pre-index, post-index, signed-offset} instruction implemented
  - STR{pre-index, post-index, signed-offset} instruction implemented
  - LDR tests ported from ARM
  - LDP todo
* Implemented parent register handling
  - Register parent map
  - Storing registers
  - Loading registers
  - Headers
  - Need more changes to conversions, I think 'mov w0, #3' zeroes out the upper 32bits of x0 register. But need to investigate further.
* LLVM data layout modified for ARM64
  - taken from uname -a in qemu arm64 machine
    Linux debian-aarch64 4.9.0-4-arm64 #1 SMP Debian 4.9.65-3+deb9u1 (2017-12-23) aarch64 GNU/Linux
* Removed useless debug output
* getCarryRegister for ARM64 fixed
* Store register ZEXT_TRUNC, 32bit tests baseline + tests
  - when writing value to 32bit reg the 64bit, the value is zero extended to the whole register
  - parent register mapping enabled in tests
  - 32bit version of tests
* Zero extension tests for ADD and MOV 32bit variants
* Implemented BL instruction
  - added tests for label and imm branch
* Implemented RET instruction
  - added tests
* Implemented LDP instruction
  - added tests for instruction
* Implemented ADRP instruction
  - real binary testing is needed
  - without tests
* enable arm64 in decompiler.py and add arm64 architecture in Architecture::setArch()
  ARM64 needs to be set before ARM because "arm" from ARM matches the "arm aarch64" from ARM64
* Arm64 ABI implementation
* Arm64 decoder ported from Arm
* Arm64 imm operand shifts should not update flags by default.
  - Added the option to switch this behaviour
  - add one ADD test with shift
* Operand register extension generator + 64bit variant extension tests
  - Arm supports the extension of operand e.g. 'add x0, x1, w2, SXTW' will sign-extend the w2 register to 64 bit and after that add the values
  - test for 64bit variant implemented
  - need to check the optional imm(shift VM outputs weird values)
* Arm64 Zero/Sign extension 32bit variant tests
* Implemented SUB instruction
  - added tests for instruction
* Implemented BR instruction
  - added tests for instruction
* Arm64 syscall id register is X8
* Specified call and return instruction ID for implemented instruction
  - BL Branch link is hinting the function call
  - RET is hinting the function return
* Fixed compilation after merge
  - new methods added isOperandRegister, getOperandAccess
  - loadOpTernaryop1op2 probably changed to loadOpBinaryOrTernaryOp1Op2
  - made sure all unit tests passed
  - TODO: implement new conventions from master
* Generate pseudoasm instruction when translation routine is not found
  - Function to generate condition code
* Check preconditions in implemented arm64 instructions
* Changed register generation to match other modules.
* LDR instruction all 3 formats + tests
  - register
  - imm
  - literal (label)
* Binaries can now be decompiled
  - jumpTargetDryRun updated
* Generate condition codes for conditional instructions.
* ARM64: strb, strh instructions + tests
* Arm64: conditional and unconditional branch instruction + tests
  - removed the generation of conditional code in translate instruction function, this is not necessary because condition is generated in body of given instruction and arm64 support only specific instruction to be conditional.
* Arm64: Instruction ret can have optional register operand + test
* Arm64: BLR instruction + test
* Arm64: CBNZ, CBZ instruction + test
* Arm64: TBNZ, TBZ implementation + tests
* Arm64: LDR different size variants, sign/zero extend + tests
* Arm64: LDPSW instruction + tests
  - minor warning fix in STR instruction
* Arm64: ADC instruction + tests
  - including flag setting for ADC and ADD instructions
  - ADDS tests
* Arm64: ADCS 32bit tests for flags
* Arm64: ADR, ADRP instruction + tests
* Arm64: AND, ANDS instruction + tests
* Arm64: ASR instruction + tests
  - ASRV variant
* Arm64: LSL, LSR, ROR instructions + tests
  - all major shifts implemented
* Arm64: SUB, SBC flags + tests
  - changed asserts to exceptions
* Arm64: CMP, CMN instructions + tests
* Arm64: CSEL instruction + tests
* Arm64: CSET, CSETM instruction + tests
* Arm64: MUL instruction + tests
* Arm64: MADD instruction + tests
  - 32bit tests for MUL
* Arm64: MSUB instruction + tests
* Arm64: MNEG instruction + tests
* Arm64: NEG, NEGS instruction + tests
* Arm64: NGC, NGCS initial implementation + tests
  - Check the carry flags + add tests
* Arm64: SDIV, UDIV instruction + tests
* Arm64: Fix correct semantics for SBC and NEG instructions
* Arm64: SMADDL, UMADDL instruction + tests
* Arm64: UMSUBL, SMSUBL instruction + tests
* Arm64: SMNEG, UMNEG instruction + tests
* Arm64: UMULL, SMULL, UMULH, SMULH instruction + tests
* Arm64: Conditional select operation instruction + tests
* Arm64: CINC, CINV, CNEG tests
* Arm64: EON, EOR instruction + tests
* Arm64: ORN, ORR instruction + tests
* Arm64: TST instruction + tests
  - fixed the AND instruction to set carry and overflow flags to zero
* Arm64: EXTR instruction + tests
* Arm64: Extend instructions + tests
* Arm64: CCMN, CCMP instruction + tests
* Arm64: NOP instruction + tests
* Arm64: REV, RBIT, CLZ instructions + tests
* Arm64: BIC instruction + tests
* Arm64: Unprivileged loads/stores instructions + tests
* Arm64: Load/Store exclusive instructions + tests
* ARM64: LDAXR instruction variants + tests
* Arm64: LDAR instruction variants + tests
* Arm64, llvmir-emul: don't lower bitreverse intrinsic
  - updated tests to check if the correct intrinsic functions was called
* Arm64: FP environment + basic unary and binary operations + tests
* Arm64: FMIN, FMINNM, FMAX, FMAXNM instruction + tests
* Arm64: FCMP, FCCMP, FCVT, {U, S}CVTF instructions + tests
* Arm64: FCVTZS, FCVTZU instructions + tests
  - let's start testing
* Arm64, bin2llvmir: Decoder should not analyse stack.
* Arm64: MOVK instruction + tests
* Arm64: MOVN instructions + tests
* Merge master with arm-prep
* Architecture: Change arm architectures to account for arm64
  -> isArmOrThumb renamed to isArm32OrThumb
  -> added isArm32 method
  -> thumb is now set with a flag _thumbFlag
* Architecture: Removed the wrong architecture types
  Now the enum eArch represents only general architecture and all subtypes of architecture are checked to getBitSize() or _thumbFlag. The function isArm() return true for every type of subarchitecture e.g. {arm32, arm64 or thumb}
* Arm64: XZR loads zero and discards result when written
  - Added some instruction IDs to branch types
* Arm64: STR and LDR instructions now determine correct register size
  - For example 'str w0, [sp]' should store only 4bytes to stack pointer
* Arm64: Syscall optimization and detection
  Replace svc #0 with corresponding syscall decoded from previous assignments.
* Arm64: MOVI instructions + tests, Vector and half register
  Generate Vector registers so in case the pseudo instructions with them as operands is generated we don't crash. For the similar purpose I changed the f16 in ARM64_REG_H* to i16 since half type is not supported and we want to be able to at least generate pseudo instructions.
* Arm64: STR and LDR tests
  Those tests target loading and storing floating point values.
* Arm64: Removed zero division semantics from llvmir
  - Zero division is NOW undefined behaviour
  - This caused problems in modulo idiom detection
  - Also removed corresponding tests
* Arm64: FMOV instruction with immediate values
  - Correctly handle imm values as operands of this instruction
* Revert "Arm64, bin2llvmir: Decoder should not analyse stack."
  This reverts commit 7b88475. This change caused other tests to fail.
* Arm64: Simplified and documented some code
  - Removed unused code from decoder/arm64.cpp
  - Fixed insnWrittesPcArm64 to work better
  - Fixed Cond branch tests
* Arm64: Fixed documentation build
Had some issues with cmake on Xenial, so switching to bionic (next LTS) seemed to be the route of least resistance.