Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Levenshtein Distance Computation (Remove Duplicates) #315

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

devalexandre
Copy link

@devalexandre devalexandre commented Dec 31, 2024

I have:

Description of changes

Fixes #314

This pull request provides optimizations for three benchmarks:

  1. Loops

    • Consolidates repeated operations to reduce total iterations and computation time.
  2. Fibonacci

    • Replaces naive recursion with an iterative or formula-based approach, eliminating exponential recursive calls.
  3. Levenshtein

    • Removes duplicate distance calculations by computing each pair of strings only once (instead of both ((i, j)) and ((j, i))).

Rationale

  • Loops: Fewer iterations and more streamlined logic cut down overall execution time.
  • Fibonacci: Iterative or formula-based methods run in (O(n)) instead of (O(2^n)) for naive recursion.
  • Levenshtein: The space-optimized Wagner-Fischer algorithm remains the same, but we halve the total number of distance computations by avoiding symmetrical checks.

Benchmarks

  • Loops: Preliminary tests show a measurable reduction in CPU usage when redundant operations are removed.
  • Fibonacci: Large input values now compute much faster, with complexity dropping from exponential to linear time.
  • Levenshtein: For 1,000 sample strings (average length ~50), total computation time was reduced by approximately 50% through the removal of redundant pair checks.

Additional Notes

  • The original correctness of results is preserved for each benchmark.
  • No concurrency changes were introduced in these optimizations.

Thank you for reviewing this PR! If there are any questions or feedback, please let me know.

@PEZ
Copy link
Collaborator

PEZ commented Dec 31, 2024

Hi, thanks for contributing! 🙏

I'm not sure what

Consolidates repeated operations to reduce total iterations and computation time.

Means, but it sounds like it dodges the work that all languages are required to do.

The changes to the fibonacci and levenshtein code are definitively non-compliant. The fibonacci is by intention naïve, and for levenshtein we so far require the redundant checks.

Also, we do want benchmark run.sh output from before and after the change to be attached to the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize code in golang
2 participants