Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tools/symbol-check.py #1135

Closed
wants to merge 1 commit into from
Closed

Conversation

hebasto
Copy link
Member

@hebasto hebasto commented Aug 22, 2022

The new tools/symbol-check.py aims to ensure that only expected symbols are exported in the secp256k1 shared library.

The script itself stems from Bitcoin Core's contrib/devtools/symbol-check.py and bitcoin/bitcoin#25020. It uses the LIEF library.

Useful to ensure that build system changes (including #1113) will not introduce any unexpected regressions.

@real-or-random
Copy link
Contributor

I think checking that not too many symbols are exported could be interesting for us.

Checking libraries not so much, we anyway don't have dependencies, so all linkage should be introduced by the compiler. Or what do you think?

@hebasto
Copy link
Member Author

hebasto commented Aug 23, 2022

Checking libraries not so much, we anyway don't have dependencies

Cleaned up, improved, undrafted.

@hebasto hebasto marked this pull request as ready for review August 23, 2022 13:49
@hebasto hebasto changed the title [WIP] Add contrib/symbol-check.py Add contrib/symbol-check.py Aug 23, 2022
@real-or-random
Copy link
Contributor

real-or-random commented Aug 23, 2022

Now that I think about this, I wonder why that would be necessary. I've never seen any libraries doing this. I think in our case it could be helpful to prevent mistakes where we forget to make an internal function static and then it would be exported?

But on the other hand, it's yet another file that needs to be maintained. In particular the list of function must be kept in sync with the include files. This violates the DRY principle. Do you think the list could be generated on-the-fly, maybe just using some clever grep SECP256K1_API magic? A more elaborate version would be to use clang-query or similar, see #833, but I think this would be overkill for now.

Weak Concept ACK because it has potential to prevent mistakes where we forget static

@hebasto
Copy link
Member Author

hebasto commented Aug 23, 2022

I wonder why that would be necessary.

Because quality matters :)
For example:

However, generally it is not recommended to export all the symbols in modules. Programmers can just export symbols as needed. This does not only benefit library security, but also benefits dynamic linking time.

I've never seen any libraries doing this.

bitcoin/bitcoin#25020 (comment)

I think in our case it could be helpful to prevent mistakes where we forget to make an internal function static and then it would be exported?

The secp256k1_pre_g is not static (in most cases). Nevertheless, its export must be avoided.

Other use cases:

  • it has helped to avoid regression in build: Add CMake-based build system #1113
  • a sanity check of linker flags when building without any build system
  • when a build system being changed or modified
  • with future use of CMake, its new version could potentially introduce a regression into its linking code

maybe just using some clever grep SECP256K1_API magic?

👍

@sipa
Copy link
Contributor

sipa commented Nov 21, 2022

maybe just using some clever grep SECP256K1_API magic?

That would be neat.

@hebasto
Copy link
Member Author

hebasto commented Nov 30, 2022

Updated 8ae0d51 -> 766964f (pr1135.02 -> pr1135.03, diff).

maybe just using some clever grep SECP256K1_API magic?

That would be neat.

Done.

@real-or-random
Copy link
Contributor

This fails for me with

> ./contrib/symbol-check.py ./.libs/libsecp256k1.so
./.libs/libsecp256k1.so: symbol memcpy from unsupported version GLIBC_2.14(4)
./.libs/libsecp256k1.so: failed IMPORTED_SYMBOLS

@hebasto
Copy link
Member Author

hebasto commented Dec 2, 2022

This fails for me with

What system/compiler on?

UPD. I assume, it is a clang...

@sipa
Copy link
Contributor

sipa commented Dec 2, 2022

llibsecp256k1, unlike Bitcoin Core, has no affordances to make sure the produced builds are compatible with certain, specific, version of glibc on the user's system.

I think the max version checks should just be dropped here.

@hebasto
Copy link
Member Author

hebasto commented Dec 2, 2022

Updated 766964f -> 6a97f72 (pr1135.03 -> pr1135.04, diff).

I think the max version checks should just be dropped here.

Done.

@real-or-random
Copy link
Contributor

llibsecp256k1, unlike Bitcoin Core, has no affordances to make sure the produced builds are compatible with certain, specific, version of glibc on the user's system.

Ok yes, I was wondering what the purpose of the max version is... :)

@real-or-random
Copy link
Contributor

Bikeshedding: I believe contrib is the wrong directory for this, see for example this attempt at a definition: https://softwareengineering.stackexchange.com/questions/252053/whats-in-the-contrib-folder#339925.

I know, Bitcoin Core uses a different definition. My point is that the current contents of the folder fit the definition on stackexchange, so we shouldn't mix up the two.

I think util or tools is good a name. I wouldn't mind putting it in the root either.

@hebasto
Copy link
Member Author

hebasto commented Dec 2, 2022

Updated 6a97f72 -> 6813bee (pr1135.04 -> pr1135.05, diff).

I think util or tools is good a name. I wouldn't mind putting it in the root either.

Moved into the tools directory.

@fanquake
Copy link
Member

fanquake commented Dec 2, 2022

It's not clear to me that adding this script is worth it (especially as it continue to shrink). We seem to be adding a new language dependency, and third party library dependency, to run grep + objdump? I might have thought a bash script would be a better fit for libsecp256k1, especially since this is basically just a sanity check that will live in the CI (as the API rarely changes, and I don't think anyone will run this locally)?

Is there a reason this doesn't support macOS shared libs? LIEF supports them as far as I'm aware, and it seems odd to add a platform agnostic binary parser as a dependency, and then not support all libraries that are being built.

@real-or-random
Copy link
Contributor

It's not clear to me that adding this script is worth it (especially as it continue to shrink). We seem to be adding a new language dependency, and third party library dependency, to run grep + objdump? I might have thought a bash script would be a better fit for libsecp256k1, especially since this is basically just a sanity check that will live in the CI

I think @hebasto convinced me above that this tool has >0 benefit, but indeed, it introduces a maintenance burden and doing the same with objdump instead of Python+LIEF might reduce this burden (but I don't know -- for example I don't know if objdump exists on macOS).

Note that we already have python as a CI dependency for our sage scripts.

@real-or-random real-or-random changed the title Add contrib/symbol-check.py Add tools/symbol-check.py Dec 2, 2022
@fanquake
Copy link
Member

fanquake commented Dec 2, 2022

(but I don't know -- for example I don't know if objdump exists on macOS).

It does, but at the moment, I guess that doesn't matter, given the script doesn't check macOS libs.

Would you expect an API issue to be present in one shared library, but not all others? We might be able to get away with just sanity checking an ELF lib, as I assume if there's an issue that you'd want to catch, present there, it'll be present everywhere.

@hebasto
Copy link
Member Author

hebasto commented Dec 4, 2022

Is there a reason this doesn't support macOS shared libs?

Updated to support macOS libs. LIEF v0.13.0 (not released yet) is required.

Also: https://lief-project.github.io/blog/2022-05-08-macho/

@fanquake
Copy link
Member

fanquake commented Dec 4, 2022

LIEF v0.13.0 (not released yet) is required.

Looks like you could do something like this to check that the expected exports are present?
Works with LIEF 0.12.3 (latest release):

diff --git a/tools/symbol-check.py b/tools/symbol-check.py
index 8206b95..7237158 100755
--- a/tools/symbol-check.py
+++ b/tools/symbol-check.py
@@ -59,12 +59,10 @@ def check_MACHO_exported_functions(library, expected_exports) -> bool:
     ok: bool = True
     macho_lib: lief.MACHO.Binary = library.concrete
 
-    for function in macho_lib.exported_functions:
-        name: str = function.name[1:]
-        if name in expected_exports:
-            continue
-        print(f'{filename}: export of function {name} not expected')
-        ok = False
+    for name in expected_exports:
+        if not macho_lib.has_symbol(name):
+            print(f'{filename}: export of function {name} missing')
+            ok = False

@hebasto
Copy link
Member Author

hebasto commented Dec 13, 2022

Updated 6e6aa49 -> cc7933d (pr1135.06 -> pr1135.07):

@real-or-random real-or-random mentioned this pull request Dec 19, 2022
4 tasks
@real-or-random
Copy link
Contributor

We've discussed this at length in the meeting today, see https://gnusha.org/secp256k1/2022-12-19.log for details.

Conclusion:

  • @hebasto will try to incorporate the suggestion from @fanquake to make this work on current LIEF functions
  • @hebasto will add commits that call the script in CI, and maybe add builds with -fvisibility=default

@hebasto
Copy link
Member Author

hebasto commented Dec 20, 2022

  • @hebasto will try to incorporate the suggestion from @fanquake to make this work on current LIEF functions

LIEF v0.13.0 (not released yet) is required.

Looks like you could do something like this to check that the expected exports are present? Works with LIEF 0.12.3 (latest release):

diff --git a/tools/symbol-check.py b/tools/symbol-check.py
index 8206b95..7237158 100755
--- a/tools/symbol-check.py
+++ b/tools/symbol-check.py
@@ -59,12 +59,10 @@ def check_MACHO_exported_functions(library, expected_exports) -> bool:
     ok: bool = True
     macho_lib: lief.MACHO.Binary = library.concrete
 
-    for function in macho_lib.exported_functions:
-        name: str = function.name[1:]
-        if name in expected_exports:
-            continue
-        print(f'{filename}: export of function {name} not expected')
-        ok = False
+    for name in expected_exports:
+        if not macho_lib.has_symbol(name):
+            print(f'{filename}: export of function {name} missing')
+            ok = False

As the suggested approach changes the check logic, the script fails if any of modules are disabled. For example, the "recovery" module is disabled by default, and script ends with:

.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_recoverable_signature_parse_compact missing
.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_recoverable_signature_convert missing
.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_recoverable_signature_serialize_compact missing
.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_sign_recoverable missing
.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_recover missing
.libs/libsecp256k1.1.dylib: failed EXPORTED_FUNCTIONS

Making this PR a draft for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants