Skip to content

Conversation

@johnslavik
Copy link
Contributor

@johnslavik johnslavik commented Dec 7, 2025

Oh the irony that this was never a real path.

I need to figure out how to test this -- it's a regression, after all, so no excuses this time.

CC @ZeroIntensity needs backport to 3.13, needs backport to 3.14

@johnslavik johnslavik marked this pull request as draft December 7, 2025 01:41
@johnslavik johnslavik changed the title Don't pass the "real path" of Pdb script target to system functions gh-142315: Don't pass the "real path" of Pdb script target to system functions Dec 7, 2025
@ZeroIntensity ZeroIntensity added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels Dec 7, 2025
@johnslavik
Copy link
Contributor Author

I need to figure out how to test this -- it's a regression, after all, so no excuses this time.

Or... maybe I just mention the issue in a comment above the affected lines -- this will be my escape hatch to not write tests 💡

@johnslavik johnslavik force-pushed the pdb-realpath-pseudofs-problem branch 2 times, most recently from ac11ff4 to 682069d Compare December 7, 2025 13:26
@johnslavik johnslavik force-pushed the pdb-realpath-pseudofs-problem branch from 682069d to 7050770 Compare December 7, 2025 13:26
@johnslavik
Copy link
Contributor Author

Ok, this one is clean and works!

@johnslavik johnslavik marked this pull request as ready for review December 7, 2025 13:28
@johnslavik johnslavik force-pushed the pdb-realpath-pseudofs-problem branch from 7767efa to d9932f5 Compare December 7, 2025 13:35
@jaraco
Copy link
Member

jaraco commented Dec 7, 2025

Thanks for the contrib!

I need to figure out how to test this -- it's a regression, after all, so no excuses this time.

Or... maybe I just mention the issue in a comment above the affected lines -- this will be my escape hatch to not write tests 💡

My preference would be to create a regression test. It seems it shouldn't be too hard to create or identify a pseudofs object to pass in and catch the SystemExit. I'll work on one.

I saw in email that @johnslavik did some analysis on the root cause in readpath/readlink. It'd be nice to get that published somewhere. Even if there's nothing wrong with Python's wrapper around realpath, it may still be worth filing an issue to capture the investigation and consider updating the docs to flag this potential issue.

@jaraco
Copy link
Member

jaraco commented Dec 7, 2025

I've created #142383 with a test. Assuming that test is failing as expected, I'll try merging it with this fix.

@johnslavik johnslavik force-pushed the pdb-realpath-pseudofs-problem branch from 60aa009 to e42ba5e Compare December 7, 2025 18:24
@johnslavik johnslavik force-pushed the pdb-realpath-pseudofs-problem branch from e42ba5e to 5b12904 Compare December 7, 2025 18:25
@johnslavik
Copy link
Contributor Author

I think I like it the way it is now. It's a problem of maintaining the equivalence invariant -- easy to follow.

@jaraco jaraco force-pushed the pdb-realpath-pseudofs-problem branch from 264e306 to 4ca2bcf Compare December 7, 2025 20:25
@johnslavik
Copy link
Contributor Author

@jaraco Can we display self._target in the error instead of target?

@jaraco
Copy link
Member

jaraco commented Dec 7, 2025

@jaraco Can we display self._target in the error instead of target?

We could; it's not obvious to me what is the best choice. I elected not to change the behavior from what was stable (and just surgically address the one edge case). What's the case for using _target (realpath)? What was used on older Pythons?

@johnslavik
Copy link
Contributor Author

johnslavik commented Dec 7, 2025

What's the case for using _target (realpath)?

I'm not aiming to display realpath in the message, but to display the same path that was checked, not something potentially different which can lead to a confusing or incorrect message. It was part of the report -- had the same target that was checked been used in the error message, it would have been much clearer what happened. I'm fine with not displaying realpath, but only provided that we don't check the realpath -- so, what we check and what we display is always the same (this is the most important to me, because only then the message is certainly correct).

Apart from that, self._safe_realpath already does check whether the realpath exists, so to avoid checking the same path twice, I think that we should instead first check the original path and only then normalize to safe realpath. 4ca2bcf reverts this proposal.

What was used on older Pythons?

The discrepancy between what is displayed and what is checked was introduced in GH-26992.
Prior to that, realpath wasn't used for anything other than the sys.path[0] setup.

@jaraco
Copy link
Member

jaraco commented Dec 8, 2025

Thanks for the explanation. I do agree, it's a little disorienting that the check is on one value but the error message reflects a different value.

As you pointed out, GH-26992 originated this issue, and in that change, I explicitly attempted to avoid or to account for any meaningful changes to behavior. That included not changing the input path in the error message. I suspect it's important to the user that whatever they supplied as the script name is reflected in the error message (and not the realpath-mutated value).

So maybe it makes sense to use target for both the check and the error message (as you've proposed).

If that's important, that aspect is not captured by a regression test. I think we need a regression test or at least a comment that captures this refined expectation.

As I look at this further, I'm struggling to understand why the realpath issue didn't affect older Pythons.

Prior to that, realpath wasn't used for anything other than the sys.path[0] setup.

Except it was. The mutation happened here, which was then passed to _runscript here, which is consumed here. I don't yet understand why the broken realpath didn't cause problems subsequently.

@jaraco
Copy link
Member

jaraco commented Dec 8, 2025

Oh! I see now. The use of realpath was introduced in #23338. That suggests that the proper fix may be to go back to that older behavior and narrow the use of realpath to the sys.path[0] mutation.

@jaraco
Copy link
Member

jaraco commented Dec 8, 2025

In 855aac3, I've isolated the realpath usage to the sys.path[0] mutation. This change now fully restores the prior expectation that the indicated target is used as input. Does that fully address the issue?

@jaraco
Copy link
Member

jaraco commented Dec 8, 2025

That doesn't work. It causes a regression in test_issue42383.

@johnslavik
Copy link
Contributor Author

johnslavik commented Dec 8, 2025

Thanks for the explanation. I do agree, it's a little disorienting that the check is on one value but the error message reflects a different value.

Exactly.

As you pointed out, GH-26992 originated this issue, and in that change, I explicitly attempted to avoid or to account for any meaningful changes to behavior. That included not changing the input path in the error message. I suspect it's important to the user that whatever they supplied as the script name is reflected in the error message (and not the realpath-mutated value).

Precisely.

So maybe it makes sense to use target for both the check and the error message (as you've proposed).

If that's important, that aspect is not captured by a regression test. I think we need a regression test or at least a comment that captures this refined expectation.

Sure!

As I look at this further, I'm struggling to understand why the realpath issue didn't affect older Pythons.

Prior to that, realpath wasn't used for anything other than the sys.path[0] setup.

Except it was. The mutation happened here, which was then passed to _runscript here, which is consumed here. I don't yet understand why the broken realpath didn't cause problems subsequently.

Thank you -- that's correct, I must have misread it.

@jaraco Would the logic from 5b12904 suffice? I believe it passed all tests and it solved both problems, namely (1) potentially incorrect error message (2) support for live descriptors of internal kernel objects.
I can work further to adapt the tests to check the error messages as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting merge needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants