Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Lah numbers and clean up combinatorics section #39379

Open
wants to merge 15 commits into
base: develop
Choose a base branch
from

Conversation

kwankyu
Copy link
Collaborator

@kwankyu kwankyu commented Jan 25, 2025

According to the Wikipedia article https://en.wikipedia.org/wiki/Lah_number, a formula that computes the Lah number $L(n,k)$, also called the Stirling numbers of the third kind, is

$$\binom{n-1}{k-1}\frac{n!}{k!}$$

but the most efficient form of the formula seems to be

$$\binom{n}{k}\frac{k}{n}\binom{n}{k}(n-k)!$$

contributed by @user202729.

Long time ago, #17161 proposed yet another formula that gives $L(n,k)$ for all integers $n,k$. But that formula is extremely slow when implemented in Sage. Like existing functions for Stirling numbers, we only support the main case where $n$ and $k$ are nonnegative.

After this PR, we may close #17161, #17159, #17123 since we do not intend to support the negative $k$ case for combinatorial functions, which is I think only of minor interest and in controversy.

I made some cosmetic changes to the file src/sage/combinat/combinat.py, mostly for user friendliness.

While at it, I ended up making extensive changes to index pages of the combinatorics section, for tidiness. In particular,

  • removed some of "forever" TODOs.
  • removed excessive cross references.
  • uniformized capitalization.

and also fixed #17421.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

Copy link

github-actions bot commented Jan 25, 2025

Documentation preview for this PR (built with commit b51a2c4; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

@user202729
Copy link
Contributor

While we're at it, maybe binomial(n, k)*(n-k).factorial() would be faster than n.factorial()//k.factorial()?

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jan 26, 2025

Not really, at least on my machine:

sage: def lah_numbers4(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.factorial() // k.factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
....: def lah_numbers5(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = binomial(n, k)*(n-k).factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
sage: timeit('T1 = matrix([[lah_numbers4(n,k) for n in range(100)] for k in rang
....: e(100)])')
25 loops, best of 3: 9.74 ms per loop
sage: timeit('T2 = matrix([[lah_numbers5(n,k) for n in range(100)] for k in rang
....: e(100)])')
5 loops, best of 3: 116 ms per loop
sage: T1 = matrix([[lah_numbers4(n,k) for k in range(10)] for n in range(10)])
sage: T2 = matrix([[lah_numbers5(n,k) for k in range(10)] for n in range(10)])
sage: T1 == T2
True

@user202729
Copy link
Contributor

Apparently there's massive overhead in binomial(n, k) (symbolic version) over n.binomial(k) for small input. I only tested for large input.

Try this one

    a = n.binomial(k)
    return a * k // n * a * (n-k).factorial() 

(note that n.binomial(k) * k / n == (n-1).binomial(k-1) so the division is exact)

In my experiment, it has minimal overhead for small n and k (as in your matrix) and a few times faster for e.g. lah_numbers(10^6, 10^5) .

It looks impossible to have a version that calls binomial()/factorial() only twice and avoid division though. Best division-free version I got is (n-1).binomial(k-1) * n.binomial(k) * (n-k).factorial() but it takes 3 calls (slower).

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jan 26, 2025

Voilà!

sage: def lah_numbers4(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.factorial() // k.factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
sage: def lah_numbers6(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.binomial(k)
....:     return a * k // n * a * (n-k).factorial()
....: 
sage: timeit('T1 = matrix([[lah_numbers4(n,k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 9.36 ms per loop
sage: timeit('T1 = matrix([[lah_numbers6(n,k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 11.1 ms per loop
sage: timeit('T1 = matrix([[lah_numbers4(10000+n,10000+k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 1.91 s per loop
sage: timeit('T2 = matrix([[lah_numbers6(10000+n,10000+k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 20.7 ms per loop
sage: timeit('T1 = matrix([[lah_numbers4(10000+n,k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 10.7 s per loop
sage: timeit('T2 = matrix([[lah_numbers6(10000+n,k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 2.19 s per loop

Nice!

@kwankyu kwankyu force-pushed the p/add-lah-numbers branch 2 times, most recently from 260df5f to 42ef17b Compare January 26, 2025 22:23
@kwankyu kwankyu marked this pull request as ready for review January 26, 2025 23:41
@kwankyu kwankyu changed the title Add Lah numbers Add Lah numbers and clean up combinatorics section Jan 26, 2025
@kwankyu kwankyu force-pushed the p/add-lah-numbers branch 2 times, most recently from b5d05e2 to 44c3dc1 Compare January 31, 2025 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants