Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double precision #895

Merged
merged 32 commits into from
Sep 3, 2024
Merged

Double precision #895

merged 32 commits into from
Sep 3, 2024

Conversation

pca006132
Copy link
Collaborator

Closes #542. This is mostly just find-and-replace...

Basically, everything except MeshGL now uses 64 bit floating point numbers, so things should be more precise, and we can improve our precision bound.

@elalish There are a few tests that are failing, and more tests fail when I try to reduce kTolerance. Maybe you can take a look into it?

@pca006132 pca006132 requested a review from elalish August 11, 2024 08:25
@pca006132
Copy link
Collaborator Author

One issue is that precision related calculation in RelatedGL may be messed up, because precision with MeshGL is different comparing with Manifold (32bit vs 64bit floats).

@elalish
Copy link
Owner

elalish commented Aug 11, 2024

Great, thanks! Those failures don't look too bad - and I just figured out I regressed the Smooth.SineSurface test some time ago, which is why it's flaky now - I'll fix that separately.

@pca006132
Copy link
Collaborator Author

Great, thanks! Those failures don't look too bad - and I just figured out I regressed the Smooth.SineSurface test some time ago, which is why it's flaky now - I'll fix that separately.

Thanks! The Smooth.SineSurface issue is the most troubling one, nice to know that it is not caused by the changes here!

When we lower kTolerance, there will be more and more errors related to number of triangles and Expected: (volume) <= (area * out.Precision()) which is related to MeshGL precision. I think we should modify the MeshGL precision value to max of Manifold precision and bounding box scale * 1e-5, to account for rounding when we convert doubles to floats?

At some point, iirc kTolerance = 1e-9, we start to get MatchesTriNormals error for GyroidModule. Considering it is SDF, not sure if it is some SDF issue after my changes. For some reason, changing kTolerance to 1e-12 fixes it again, so maybe there is some simplification issue there?

@pca006132
Copy link
Collaborator Author

I tried checking the errors manually. The produced mesh looks fine, so I think they are mostly caused by our tests depending on the old precision values. However, for Boolean.Coplanar, I am not sure why it is producing a mesh like this:

image

If we overlay out and cylinder2, we can see that it is somehow creating new vertices out of nowhere:

image

Also, there is a weird error for Polygon.BigSponge:

polygon_test.cpp:75: Failure
Expected equality of these values:
  triangles.size()
    Which is: 1767
  expectedNumTri
    Which is: 1771
Turn 180

This means that the number of triangles changed when it is rotated?

Here are the patch that I use to fix (silence) other errors while playing with higher precision:

diff --git a/src/utilities/include/public.h b/src/utilities/include/public.h
index 5445b7ae..d180faef 100644
--- a/src/utilities/include/public.h
+++ b/src/utilities/include/public.h
@@ -41,7 +41,7 @@ namespace manifold {
   return n;
 }
 
-constexpr double kTolerance = 1e-5;
+constexpr double kTolerance = 1e-12;
 
 /** @defgroup Math data structure definitions
  *  @brief Abstract away from glm.
diff --git a/test/hull_test.cpp b/test/hull_test.cpp
index f64ec4ce..6a72e46b 100644
--- a/test/hull_test.cpp
+++ b/test/hull_test.cpp
@@ -78,7 +78,7 @@ TEST(Hull, Tictac) {
   }
 #endif
 
-  EXPECT_EQ(sphere.NumVert() + tictacSeg, tictac.NumVert());
+  EXPECT_NEAR(sphere.NumVert() + tictacSeg, tictac.NumVert(), 2);
 }
 
 #ifdef MANIFOLD_EXPORT
diff --git a/test/samples_test.cpp b/test/samples_test.cpp
index ccc6ef3f..e536a5f7 100644
--- a/test/samples_test.cpp
+++ b/test/samples_test.cpp
@@ -182,7 +182,7 @@ TEST(Samples, Bracelet) {
   CheckGL(bracelet);
 
   CrossSection projection(bracelet.Project());
-  projection = projection.Simplify(bracelet.Precision());
+  projection = projection.Simplify(bracelet.BoundingBox().Scale() * 1e-5);
   Rect rect = projection.Bounds();
   Box box = bracelet.BoundingBox();
   EXPECT_FLOAT_EQ(rect.min.x, box.min.x);
diff --git a/test/smooth_test.cpp b/test/smooth_test.cpp
index e65f1fd9..c0a330a2 100644
--- a/test/smooth_test.cpp
+++ b/test/smooth_test.cpp
@@ -327,7 +327,7 @@ TEST(Smooth, Torus) {
 #endif
 }
 
-TEST(Smooth, SineSurface) {
+TEST(Smooth, DISABLED_SineSurface) {
   MeshGL surface = MeshGL::LevelSet(
       [](vec3 p) {
         double mid = std::sin(p.x) + std::sin(p.y);
diff --git a/test/test_main.cpp b/test/test_main.cpp
index 374d15e6..c237c57d 100644
--- a/test/test_main.cpp
+++ b/test/test_main.cpp
@@ -344,6 +344,13 @@ void RelatedGL(const Manifold& out, const std::vector<MeshGL>& originals,
     }
     ASSERT_LT(i, originals.size());
     const MeshGL& inMesh = originals[i];
+
+    // to compute old precision
+    Box b;
+    for (const auto &v : out.GetMesh().vertPos) {
+      b.Union(v);
+    }
+
     for (uint32_t tri = output.runIndex[run] / 3;
          tri < output.runIndex[run + 1] / 3; ++tri) {
       if (!output.faceID.empty()) {
@@ -368,6 +375,7 @@ void RelatedGL(const Manifold& out, const std::vector<MeshGL>& originals,
         pos[3] = 1;
         inTriPos[j] = transform * pos;
       }
+      const auto actualPrecision = b.Scale() * 1e-5;
       vec3 outNormal =
           glm::cross(outTriPos[1] - outTriPos[0], outTriPos[2] - outTriPos[0]);
       vec3 inNormal =
@@ -382,7 +390,7 @@ void RelatedGL(const Manifold& out, const std::vector<MeshGL>& originals,
         for (int k : {0, 1, 2}) edges[k] = inTriPos[k] - outTriPos[j];
         const double volume =
             glm::dot(edges[0], glm::cross(edges[1], edges[2]));
-        ASSERT_LE(volume, area * out.Precision());
+        ASSERT_LE(volume, area * actualPrecision);
 
         if (checkNormals) {
           vec3 normal;
@@ -405,7 +413,7 @@ void RelatedGL(const Manifold& out, const std::vector<MeshGL>& originals,
             const double volumeP =
                 glm::dot(edgesP[0], glm::cross(edgesP[1], edgesP[2]));
 
-            ASSERT_LE(volumeP, area * out.Precision());
+            ASSERT_LE(volumeP, area * actualPrecision);
           }
         }
       }

@elalish
Copy link
Owner

elalish commented Aug 13, 2024

I think we should modify the MeshGL precision value to max of Manifold precision and bounding box scale * 1e-5, to account for rounding when we convert doubles to floats?

That sounds reasonable.

I'll take a look at these outputs myself and get back to you.

@elalish
Copy link
Owner

elalish commented Aug 13, 2024

I'm not sure what the trouble with Boolean.Coplanar is - it seems to be passing. After merging my fix it seems the only trouble is BigSponge missing 4 triangles when upside-down. Odd, but the triangulation is still valid - CCW within precision. I need to think about that some more. And there is Hull.Tictac which is flakily losing one vertex. That's also a bit strange...

Overall though, I'd say this is going rather well.

// Vec<vec3> pointCloudVec=vertPos;
QuickHull qh(vertPos);
ConvexHull hull = qh.getConvexHullAsMesh(vertPos, false);
vertPos_ = std::move(hull.vertices);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @Kushal-Shah-03 I did some cleanup on your Hull here as part of the move to doubles. Hopefully we can merge this soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, that's quite exciting, if you face issues with passing the hull tests ping me, I'll try to see if I can help.

@@ -965,7 +963,7 @@ std::vector<glm::ivec3> TriangulateIdx(const PolygonsIdx &polys,
triangles = triangulator.Triangulate();
updatedPrecision = triangulator.GetPrecision();
}
#if MANIFOLD_EXCEPTION
#ifdef MANIFOLD_EXCEPTIONS
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pca006132 it seems we had a typo here that was keeping us from checking if our triangulation was valid. That's why we got the wrong number of triangles - it actually has a few CW triangles, which is what I was expecting to get caught first. Now I'm working on figuring out the triangulator error this uncovered.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah! interesting!

Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still getting some weird issues on the CI - want to try and resolve those while leaving kTolerance unchanged just so we can land this PR? Then we should probably do a follow-up where we more carefully drop kTolerance and update tests accordingly.

@@ -65,7 +65,7 @@ bool isMeshConvex(manifold::Manifold hullManifold, double epsilon = 0.0000001) {
TEST(Hull, Tictac) {
const double tictacRad = 100;
const double tictacHeight = 500;
const int tictacSeg = 1000;
const int tictacSeg = 500;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This didn't help the flaky missing one vertex issue - very weird.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I expected that this would work. I'll try to see if I can recreate it and find out why (also I do remember experimenting with this earlier as well at the start of the GSoC project, when I pointed out that the values were off by 1 or 2, we had discussed that it could be caused due to the precision errors, and it's possible that the algorithm itself isn't leading to the failure. Even then I was able to cause a missing vertex or two in both cases increasing and reducing the tictacSegs) opencax/GSoC#89 (comment)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - it'd be interesting first just to know which vertex is missing - I wonder if it's lined up with an axis or something? Once you know it's index, hopefully it won't be too hard to see where the algorithm is losing it.

@@ -836,7 +836,7 @@ TEST(Polygon, Sponge3a) {
TestPoly(polys, 454);
}

TEST(Polygon, BigSponge) {
TEST(Polygon, DISABLED_BigSponge) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just disabled this for now since it's an existing triangulator issue - I also added a more compact repro of it in Polygon.SquareHoles2.

@pca006132
Copy link
Collaborator Author

Apparently, zebra is also having issue and SDF.blobs is too large for wasm?

@elalish
Copy link
Owner

elalish commented Aug 24, 2024

I fixed a triangulator bug and I've got all the tests passing locally. I'm a bit baffled by what the CI is showing - some won't even show me the logs on github (?) and the zebra seems to have failed in a way gtest couldn't catch? I've never seen that before - @pca006132 are you getting any test failures locally now?

@pca006132
Copy link
Collaborator Author

No, I can't reproduce any error. Maybe we should also disable the hull test that is flaky for now and see if it helps?

the zebra seems to have failed in a way gtest couldn't catch

What do you mean by this?

@elalish
Copy link
Owner

elalish commented Sep 1, 2024

I'm impressed - this is actually running our test suite faster on my M1 Mac - 38s now vs 44 on master.

@@ -192,7 +192,8 @@ MeshGL Manifold::GetMeshGL(ivec3 normalIdx) const {
!isOriginal && glm::all(glm::greaterThan(normalIdx, ivec3(2)));

MeshGL out;
out.precision = Precision();
out.precision = std::max(Precision(), std::numeric_limits<float>::epsilon() *
BoundingBox().Scale());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my update to make the MeshGL tests pass - just account for the additional rounding error. Now we're back to kTolerance = 1e-12 and everything passes on my machine.

@elalish
Copy link
Owner

elalish commented Sep 2, 2024

@pca006132 Okay, I think this PR is looking pretty solid now. The only trouble is our Windows CI, which is giving that odd 127 error for Zebra and Zebra3. Super weird because that's some kind of file not found error, though maybe just a segfault somewhere? Still, I can't imagine how it could segfault, and even it if it did, our code and GTest ought to be able to catch it and give a clearer error than that. Any ideas? I'm going to bed.

@pca006132
Copy link
Collaborator Author

No idea what that is, maybe try to enable address sanitizer for that in the windows CI?

@pca006132 pca006132 force-pushed the double-precision branch 4 times, most recently from cc8d8a7 to 3ffc865 Compare September 2, 2024 11:31
@pca006132
Copy link
Collaborator Author

pca006132 commented Sep 2, 2024

Meh, I gave up. I think we need someone with a Windows pc to try and see.

@weianweigan @reinux can you help us to try this and see if you can reproduce the issue locally?

@Kushal-Shah-03
Copy link
Contributor

I have windows, I can try to set it up?

@pca006132
Copy link
Collaborator Author

Sure, that will be great!

@Kushal-Shah-03
Copy link
Contributor

I'm sorry it took so long, I didn't have c++ and make set up on windows. I got the code to run and the error still persists. It doesn't show any error code in windows tho, I was not sure how to exactly get the error code, so I tried something called ERRORLEVEL, the value of which was -1073741571, I am not sure if it's relevant but I just thought it's worth sharing.

I also tried running the tests with -v flag. The program ended before printing any verbose output for the Zebra3 test. Based on these do you have any suggestion on where I should start looking? Otherwise I just plan to add print statements to find out where the error occurs (unless there's a way I can run gdb on windows I don't have experience with working on projects on windows).

@pca006132
Copy link
Collaborator Author

Try

cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_CXX_FLAGS="/fsanitize=address /EHsc" -DBUILD_SHARED_LIBS=OFF -DMANIFOLD_DEBUG=ON -DMANIFOLD_PAR=${{matrix.parallel_backend}} -A x64 -B build
cd build
cmake --build . --target ALL_BUILD --config RelWithDebInfo
cd build/bin/RelWithDebInfo
./manifold_test.exe

and see what you get

@@ -184,7 +184,7 @@ jobs:
shell: bash
run: |
cd build/bin/Release
./manifold_test.exe
./manifold_test.exe --gtest_filter=-Polygon.Zebra*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really want to just ignore these tests like this, but I'm hoping to at least verify these are the only tests making windows inexplicably die.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pca006132 it looks like this is indeed going to get our CI to finally pass. I'm considering just merging this PR while we work out what the Windows problem is. Mostly because I'd like to be able to keep working on PRs without creating an obscene number of merge conflicts. What do you think?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, since you're probably asleep, I've open #910 to track and I'm merging this. Thanks - I'm really glad to finally be on double precision!

@Kushal-Shah-03
Copy link
Contributor

image
I am getting this error after running these commands and executing manifold_test, I tried installing clang and building again. But that didn't resolve it any ideas?

@elalish elalish mentioned this pull request Sep 3, 2024
@elalish
Copy link
Owner

elalish commented Sep 3, 2024

@Kushal-Shah-03 Thanks - that's definitely a more specific error, so appreciated! I just opened #910 to track this, so I'm gong to merge this PR. If you have ideas, feel free to investigate, but if not, go ahead and go back to your original tasks.

@elalish
Copy link
Owner

elalish commented Sep 3, 2024

@Kushal-Shah-03 oh, I think asan refers to address sanitizer, so that may just be a problem related to -DCMAKE_CXX_FLAGS="/fsanitize=address /EHsc"

@elalish elalish merged commit 492c3a8 into elalish:master Sep 3, 2024
22 checks passed
@Kushal-Shah-03
Copy link
Contributor

@Kushal-Shah-03 oh, I think asan refers to address sanitizer, so that may just be a problem related to -DCMAKE_CXX_FLAGS="/fsanitize=address /EHsc"

Yeah, the code didn't even start execution before the error came up, but how would I fix it?

@elalish
Copy link
Owner

elalish commented Sep 3, 2024

Maybe just remove that flag? Or Google about building on windows with address sanitizer enabled?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change API changes for major versions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

allow double precision
3 participants