-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Need unittests directly in C or C++ #29
Comments
In recent years, or at least this matches my own interests, shapelib standalone has mostly be a by-product of GDAL shapelib internal copy than a project moving by itself. I'm not sure if we want to invest too much in adding and maintaining more code (tests) to shapelib standalone, whereas it is already quite extensively tested through GDAL. |
You've said this before, so it's worth a moment for me to lay out the why I'm doing this. First, the users that I see:
It is great that GDAL has extensive testing of the shapefile driver with >130 tests ( Being that high level, I have a hard time translating what I see in those to knowing what shapelib itself is supposed to be doing when I read shapelib code. Most of its behavior is documented only implicitly in how it's used by GDAL. That's very hard for me to follow. I plan for the GDAL tree I manage to split off shapelib as a separate thing where is just uses the separate gdal:port_lib target that I have. I'd like to be able to see tests that directly correspond to the files in question without all the intermediate code of GDAL and SWIG/Python between the tests and the actual code. Code like shapelib is going to be with us for a long time despite so many in the community feeling frustrated by the limitations it imposes. The longer shapelib goes on in the state that it's in, the more drag it places on the community. shapelib keeps coming up again and again for me and without cleanup, I don't have much hope of pushing it out of my awareness back to where it belongs... just the thing we have to use (directly or indirectly through gdal). I know nobody else is interested in getting shapelib in shape, but knowing that it's low level behavior is documented and tested will help me sleep better. I'm sure it's got plenty of bugs waiting to be found and when they are, code that is cleaner, documented, and directly tested will be much easier to work on for people who don't already know it. Benefits of removing the tech debt from shapelib:
And for my own internal use case: I have a huge number of targets that bring in GDAL just for some small part of it. We build (almost) all binaries statically to be hermetic and hardened. We often bump into the maximum binary ceiling (which is large). Being able to take the sizable chunk of shapelib only users and have them only link in port_lib and shapelib will be a huge win. Developers should be able to count on the libraries that are below them when they are writing code that depends on them. The tests in GDAL of shape don't make me comfortable that shapelib is actually solid and not going to someday have unexpected behavior changes in corner cases that someone probably counts on somewhere. shapelib is one thing that I have a chance of getting to a reasonable level of tech debt such that I think we will be able to mostly ignore it with confidence for another decade or two. I look at GRIB and I just feel hopeless. I poked at libtiff and two of the core developers seem apposed to the kind of cleanup that makes me confident about code functioning as intended and being maintainable by the community as a whole (as apposed to the specialists who have to deeply understand these libs now). I took a large swing at MB-System, but that didn't change the behavior of any of the coders working on the system. They've only deleted some of the tests I've added and added none of their own. But shapelib is tractable for me with the little bits of time I get between normal work that takes big continuous blocks of time and my kids jumping on me destroying my ability to think at any sort level beyond "Room on the Broom" or "Baby Shark". Shapelib is still at state of broken window syndrome in my opinion. |
FWIW, I just landed here after hitting a crash in R's built-in DBF reader, which turned out to be a vendored copy of shapelib. PostGIS also had a vendored copy for the |
It would be good to have a list of the key places where copies exist. |
I've been running a dbf fuzzer for about 7000 core hours so far. These 179 files might be helpful when writing unit tests to trigger particular code paths. |
My stripped down CMakeLists.txt works. It will work with C++-11 or C++-14 if the dbfopen_test.cc doesn't use cmake_minimum_required(VERSION 3.10)
set (PROJECT_VERSION_MAJOR 1)
set (PROJECT_VERSION_MINOR 5)
set (PROJECT_VERSION_PATCH 0)
set (PROJECT_VERSION
"${PROJECT_VERSION_MAJOR}.${PROJECT_VERSION_MINOR}.${PROJECT_VERSION_PATCH}")
project(
shapelib
LANGUAGES C CXX
VERSION ${PROJECT_VERSION})
set(CMAKE_C_FLAGS "-Wall -Wextra -O2")
set(CMAKE_CXX_FLAGS "-Wall -Wextra -O2")
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED True)
set(CMAKE_CXX_EXTENSIONS False)
set(PACKAGE shp)
set(lib_SRC
dbfopen.c
safileio.c
sbnsearch.c
shpopen.c
shptree.c
)
add_library(${PACKAGE} ${lib_SRC})
enable_testing()
find_package(Catch2 REQUIRED)
add_executable(dbfopen_test dbfopen_test.cc)
target_link_libraries(dbfopen_test Catch2::Catch2 shp)
include(CTest)
include(Catch)
catch_discover_tests(dbfopen_test) With just these files in the tree:
|
All 174 files behave as expected: SECTION("Open bad DBF files")
{
for (const auto& filename : fs::directory_iterator(fs::path{ "corpus-dbf" }))
{
const auto handle = DBFOpen(filename.path().string().c_str(), "rb");
REQUIRE(handle == nullptr);
DBFClose(handle);
}
} |
@schwehr Note, that I first went for the Catch2 unit testing framework but switched to GTest afterwards. |
Thanks for getting that into the code base! |
The current testing setup only runs command line programs and compares their results to golden files. shapelib needs unit tests that direct call the C functions with a much more diverse range of inputs. e.g. exercising various error conditions. There are lots of frameworks that would work well for this.
I'm thinking of going with Catch2. It's got a (slower) option to use a single header and source file to get started. And I've wanted to give it a try. Having these tests in C++ (probably >= C++17) will mean that some platforms will only be able to use the original shell script based testing, but that should be okay as these tests will exercise the code on at least the 4 configurations currently setup for CI in the project.
See the C and C++ sections in Wikipedia's C and C++ Frameworks in List of unit testing frameworks. Probably anything reasonably maintained and open source would be fine. Some of the options are:
Example starter based test in
dbfopen_test.cc
. Apologies for code that isn't totally clean and SetContents isn't particularly good.catch2:
Almost the same thing written with GoogleTest:
The text was updated successfully, but these errors were encountered: