Conversation

@Michal-Novomestsky
Contributor

Addresses #1743

Member

@ricardoV94 ricardoV94 left a comment

Sorry this is actually masking an error. A ScalarFromTensor should have a scalar gradient, and therefore tensor_from_scalar is correct
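
For illustration, a minimal sketch of that type contract (assuming scalar_from_tensor is importable from pytensor.tensor.basic; this is not the PR's code). The gradient with respect to the tensor input of a ScalarFromTensor node must itself be a tensor:

import pytensor
import pytensor.tensor as pt
from pytensor.tensor.basic import scalar_from_tensor

x = pt.scalar("x")             # 0-d TensorVariable (TensorType)
s = scalar_from_tensor(x)      # ScalarVariable (ScalarType)

# grad expects a 0-d tensor cost, so convert the scalar output back to a tensor
g = pytensor.grad(pt.as_tensor_variable(s), x)
print(g.type)                  # a TensorType matching x, not a ScalarType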

@Michal-Novomestsky
Contributor Author

Sorry this is actually masking an error. A ScalarFromTensor should have a scalar gradient, and therefore tensor_from_scalar is correct

Sorry, I'm not quite following - do you mean that there is a deeper bug unrelated to this code, and that fixing that bug would make this PR unnecessary?

@ricardoV94
Member

There's a deeper bug; your fix is masking it.
Assumptions:

  1. This Op can only be called on a Scalar; it always outputs a Tensor
  2. Any gradient expression coming in via reverse mode (output_gradients) must have the same type as the output, i.e., be a Tensor
  3. It should always be safe and correct to call ScalarFromTensor on this output.

Point 2 is the crux. Some Op gradient is returning a scalar when it should return a tensor, or some invalid graph manipulation was done. We need to find which variable is creating that output gradient. Can you check if the variable has a traceback in var.tag?
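
A rough sketch of how that check might look (purely illustrative; var stands in for the suspicious output-gradient variable, and tag.trace is only populated when PyTensor records creation tracebacks):

import traceback

# var is the suspect output-gradient variable (illustrative name, not from the PR)
trace = getattr(var.tag, "trace", None)
if trace:
    # each entry is a stack snapshot captured when the variable was created
    print("".join(traceback.format_list(trace[-1])))
else:
    print("no creation traceback stored on var.tag")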

@ricardoV94
Member

Do you have a MWE that reproduces the bug?

@Michal-Novomestsky
Contributor Author

Sorry, only just got back to this now!

Do you have a MWE that reproduces the bug?

I believe the example I give in the issue should be appropriate, or would you prefer something more minimal? (It was as minimal as I could make it from the original bug.)

import pytensor.tensor as pt
import pytensor
import pytensor.gradient as tg
from pymc.distributions.dist_math import check_parameters

theta = pt.scalar("theta")
x = pt.scalar("x")

# check_parameters passes x through while asserting that theta > 0
obj = check_parameters(x, theta > 0, msg="theta > 0")

x0, _ = pytensor.tensor.optimize.minimize(objective=obj, x=x)

# differentiating the minimizer's output w.r.t. theta raises the error
tg.grad(x0, theta)

Some Op gradient is returning a scalar when it should return a tensor, or some invalid graph manipulation was done. We need to find which variable is creating that output gradient. Can you check if the variable has a traceback in var.tag?

I'll try to investigate when I get back to my PC around Monday next week.

@ricardoV94
Member

ricardoV94 commented Dec 10, 2025

It's the minimize L_op's fault; it returns a tensor gradient for a scalar input.

Here is a slightly more minimal MWE with only PyTensor constructs:

import pytensor.tensor as pt
from pytensor.raise_op import assert_op

theta = pt.scalar("theta")
x = pt.scalar("x")

# assert_op plays the role of PyMC's check_parameters: it passes x through
# while asserting that theta > 0
obj = assert_op(x, theta > 0)
x0, _ = pt.optimize.minimize(objective=obj, x=x)

# gradient w.r.t. theta goes through the minimize L_op and fails
pt.grad(x0, theta)
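
(Running this snippet should reproduce the failure with a plain PyTensor install, no PyMC needed.)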

@ricardoV94
Member

Should be patched by #1784
