Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove pure function annotation #509

Merged
merged 1 commit into from
Mar 12, 2021
Merged

Remove pure function annotation #509

merged 1 commit into from
Mar 12, 2021

Conversation

mohamed82008
Copy link
Member

@mohamed82008 mohamed82008 commented Mar 5, 2021

According to conventional Julia wisdom, the use of @pure is a reserved right to Jameson and Jameson only. According to the docs:

A pure function can only depend on immutable information. This also means a @pure function cannot use any global mutable state, including generic functions. Calls to generic functions depend on method tables which are mutable global state. Use with caution, incorrect @pure annotation of a function may introduce hard to identify bugs. Double check for calls to generic functions. This macro is intended for internal compiler use and may be subject to changes.

This assumption seems to be violated here. The use of @pure can also lead to peculiar compiler failures when normal runtime errors such as stack overflow are encountered. This bit me recently pretty hard.

@KristofferC
Copy link
Collaborator

KristofferC commented Mar 5, 2021

Do you have an example of the stackoverflow problem?

@mohamed82008
Copy link
Member Author

Do you have an example of the stackoverflow problem?

It's my fault. But the error looks interesting when using @pure.

using ForwardDiff: ForwardDiff, Tag

struct MyTag end
ForwardDiff.:(::Type{Tag{F1,V1}}, ::Type{Tag{MyTag,V2}}) where {F1, V1, V2} = true
# Uncomment the next line to fix the error
#ForwardDiff.:≺(::Type{Tag{MyTag,V2}}, ::Type{Tag{F1,V1}}) where {F1, V1, V2} = false

dual1 = ForwardDiff.GradientConfig(MyTag(), [1.0]).duals[1]
dual2 = ForwardDiff.GradientConfig(sum, [1.0]).duals[1]

promote_type(typeof(dual1), typeof(dual2))

@andreasnoack
Copy link
Member

I guess the assumption here is that is not extended.

@KristofferC
Copy link
Collaborator

So is there really a problem here? If you overload an internal function like this, well, caveat emptor.

@mohamed82008
Copy link
Member Author

There is a problem that sometimes you do need to overload this internal function to specify order between custom tags. So the purity assumption is violated in a valid use case.

@mohamed82008
Copy link
Member Author

Are there any objections to this?

@mohamed82008
Copy link
Member Author

I will wait 2 more days then merge and release if no one objects.

@KristofferC
Copy link
Collaborator

Presumably the pure was there for a reason. Did you check that no longer applies?

@mohamed82008
Copy link
Member Author

The pure seems to have been introduced in #247 back in 2017 as part of a big change to tackle perturbation confusion. It's hard to guess the intentions of @simonbyrne or @jrevels back then especially given that it was a very different Julia version and compiler back then.

@KristofferC
Copy link
Collaborator

So you are saying that the compiler has improved enough that the @pure is not needed? That might very well be possible, but at least some effort to confirm this would be beneficial, in my opinion.

@mohamed82008
Copy link
Member Author

So you are saying that the compiler has improved enough that the @pure is not needed? That might very well be possible, but at least some effort to confirm this would be beneficial, in my opinion.

I don't know if the compiler has improved enough that pure is not needed because I don't know why it was needed to begin with. There was no justification provided back then. Julia 0.6 was used back then. If you have a use case that could have benefited from @pure in Julia 0.6, I would be happy to check it in Julia 1.5/1.6 to see if @pure makes any difference at all.

@jrevels
Copy link
Member

jrevels commented Mar 8, 2021

it's been a very long time since I looked at a ForwardDiff PR 😁

If you have a use case that could have benefited from @pure in Julia 0.6, I would be happy to check it in Julia 1.5/1.6 to see if @pure makes any difference at all.

I wouldn't be surprised if that were added back then to avoid potential perf issues w.r.t. calling https://github.com/JuliaDiff/ForwardDiff.jl/pull/247/files#diff-aeb036d3acc3ca0b52368fdcc3dbc8bca465b98bc269a1a5b4668b1dcb2e2f7eR12 and https://github.com/JuliaDiff/ForwardDiff.jl/pull/247/files#diff-aeb036d3acc3ca0b52368fdcc3dbc8bca465b98bc269a1a5b4668b1dcb2e2f7eR21

Given the compiler advances since then, it would be great (and unsurprising) if it's no longer useful, but as @KristofferC mentioned, this is probably worth benchmarking in a nested differentiation case that hits this codepath to confirm it won't induce regressions in common cases

@simonbyrne
Copy link
Contributor

I too suspect it would now work without it.

@KristofferC
Copy link
Collaborator

If you have a use case that could have benefited from @pure in Julia 0.6, I would be happy to check it in Julia 1.5/1.6 to see if @pure makes any difference at all.

That's not really how things work. When you go in and remove a performance annotation it is on you to put at least some level of effort to show that the annotation is not needed anymore. For example, a benchmark where the function is used or looking at the generated code with the annotation there and removed.

@mohamed82008
Copy link
Member Author

I used the following code to benchmark:

using ForwardDiff, BenchmarkTools

f(x, y) = ForwardDiff.derivative(y -> ForwardDiff.derivative(x -> (x + y)^2, x), y)
g(x, y) = ForwardDiff.jacobian(y -> ForwardDiff.gradient(x -> sum(x + y)^2, x), y)
h(x, y, z) = ForwardDiff.derivative(z -> ForwardDiff.derivative(y -> ForwardDiff.derivative(x -> (x + y + z)^2, x), y), z)

@btime f(2.0, 2.0);
@btime g($(ones(10)), $(ones(10)));
@btime h(1.0, 2.0, 3.0);
  • With @pure:
julia> @btime f(2.0, 2.0);
  0.001 ns (0 allocations: 0 bytes)

julia> @btime g($(ones(10)), $(ones(10)));
  3.896 μs (15 allocations: 16.83 KiB)

julia> @btime h(1.0, 2.0, 3.0);
  1.500 ns (0 allocations: 0 bytes)
  • Without @pure:
julia> @btime f(2.0, 2.0);
  0.001 ns (0 allocations: 0 bytes)

julia> @btime g($(ones(10)), $(ones(10)));
  3.854 μs (15 allocations: 16.83 KiB)

julia> @btime h(1.0, 2.0, 3.0);
  1.500 ns (0 allocations: 0 bytes)

The second number changes a fair bit every time I run it with and without pure but there seems to be no difference.

The code native generated in both cases:

  • With @pure
julia> @code_native h(1.0, 2.0, 3.0)
	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ REPL[61]:1 within `h'
	subq	$120, %rsp
; │┌ @ derivative.jl:14 within `derivative'
; ││┌ @ REPL[61]:1 within `#137'
; │││┌ @ derivative.jl:14 within `derivative'
; ││││┌ @ REPL[61]:1 within `#138'
	movsd	%xmm1, 24(%rsp)
	movabsq	$4607182418800017408, %rax      ## imm = 0x3FF0000000000000
	movq	%rax, 32(%rsp)
	movsd	%xmm2, 40(%rsp)
	movq	%rax, 48(%rsp)
; │││││┌ @ derivative.jl:14 within `derivative'
; ││││││┌ @ dual.jl:66 within `Dual' @ dual.jl:62 @ dual.jl:55 @ dual.jl:19
	movsd	%xmm0, 8(%rsp)
	movq	%rax, 16(%rsp)
; ││││││└
	movabsq	$"#139", %rax
	leaq	56(%rsp), %rdi
	leaq	24(%rsp), %rsi
	leaq	8(%rsp), %rdx
	callq	*%rax
; │└└└└└
	movsd	112(%rsp), %xmm0                ## xmm0 = mem[0],zero
	addq	$120, %rsp
	retq
	nopw	%cs:(%rax,%rax)
	nop
; └
  • Without @pure
julia> @code_native h(1.0, 2.0, 3.0)
	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ REPL[61]:1 within `h'
	subq	$120, %rsp
; │┌ @ derivative.jl:14 within `derivative'
; ││┌ @ REPL[61]:1 within `#137'
; │││┌ @ derivative.jl:14 within `derivative'
; ││││┌ @ REPL[61]:1 within `#138'
	movsd	%xmm1, 24(%rsp)
	movabsq	$4607182418800017408, %rax      ## imm = 0x3FF0000000000000
	movq	%rax, 32(%rsp)
	movsd	%xmm2, 40(%rsp)
	movq	%rax, 48(%rsp)
; │││││┌ @ derivative.jl:14 within `derivative'
; ││││││┌ @ dual.jl:66 within `Dual' @ dual.jl:62 @ dual.jl:55 @ dual.jl:19
	movsd	%xmm0, 8(%rsp)
	movq	%rax, 16(%rsp)
; ││││││└
	movabsq	$"#139", %rax
	leaq	56(%rsp), %rdi
	leaq	24(%rsp), %rsi
	leaq	8(%rsp), %rdx
	callq	*%rax
; │└└└└└
	movsd	112(%rsp), %xmm0                ## xmm0 = mem[0],zero
	addq	$120, %rsp
	retq
	nopw	%cs:(%rax,%rax)
	nop
; └

Looks identical to me.

@mohamed82008
Copy link
Member Author

Can I please get an approval here to merge?

@mohamed82008
Copy link
Member Author

Can I please get an approval here to merge?

Ok let me request the opposite! If you have any objections, please object in the next 24 hours. If there are no objections, I will merge.

@andreasnoack andreasnoack merged commit 5b4a905 into master Mar 12, 2021
@andreasnoack andreasnoack deleted the mt/remove_pure branch March 12, 2021 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants