That was a great talk. I was somewhat skeptical of this change, but she really looked at it from every angle and made a strong argument for the benefits of disallowing the construct. Also, I didn't know that (as mentioned in the talk) C# already disallows it.
> The projects examined contained a total of 120,964,221 lines of Python code, and among them the script found 203 instances of control flow instructions in a finally block. Most were return, a handful were break, and none were continue.
I don't really write a lot of Python, but I do write a lot of Java, and `continue` is the main control flow statement that makes sense to me within a finally block.
I think it makes sense when implementing a generic transaction loop, something along the lines of:
    <T> T executeTransaction(Function<Transaction, T> fn) {
        for (int tries = 0;; tries++) {
            var tx = newTransaction();
            try {
                return fn.apply(tx);
            } finally {
                if (!tx.commit()) {
                    // TODO: potentially log number of tries, maybe include a backoff, maybe fail after a certain number
                    continue;
                }
            }
        }
    }
In these cases "swallowing" the exception is often intentional, since the exception could be due to some logic failing as a result of inconsistent reads, so the transaction should be retried.
The alternative ways of writing this seem more awkward to me. Either you need to store the result (returned value or thrown exception) in one or two variables, or you need to duplicate the condition and the `continue;` behaviour. Having the retry logic within the `finally` block seems like the best way of denoting the intention to me, since the intention is to swallow the result, whether that was a return or a throw.
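For comparison, here is a rough sketch of the "store the result in one or two variables" alternative. The `Transaction` interface, the `Supplier` parameter, and the class name are assumptions made up for this example, not part of any real API. Note that this version only captures `RuntimeException`; an `Error` would propagate immediately without a commit, which differs from the `finally` version.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical minimal transaction interface, assumed for illustration only.
interface Transaction {
    // Returns true if the commit succeeded, false if the attempt must be retried.
    boolean commit();
}

final class StoredOutcomeRetry {
    // Runs fn in fresh transactions until one commits, then replays that attempt's outcome.
    static <T> T executeTransaction(Supplier<Transaction> newTransaction,
                                    Function<Transaction, T> fn) {
        while (true) {
            Transaction tx = newTransaction.get();
            T result = null;
            RuntimeException failure = null;
            try {
                result = fn.apply(tx);
            } catch (RuntimeException e) {
                failure = e; // keep the exception until we know the commit outcome
            }
            if (!tx.commit()) {
                continue; // discard the outcome, whether it was a value or an exception
            }
            if (failure != null) {
                throw failure;
            }
            return result;
        }
    }
}
```

The outcome now lives in two locals (`result` and `failure`), and it has to be replayed by hand after the commit check.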
If there are particular exceptions that should not be retried, these would need to be caught/rethrown and a boolean set to disable the condition in the `finally` block, though to me this still seems easier to reason about than the alternatives.
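A minimal sketch of that catch/rethrow plus boolean approach, assuming a made-up `FatalTransactionException` as the "do not retry" marker and a hypothetical single-method `Transaction` interface:

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical minimal transaction interface, assumed for illustration only.
interface Transaction {
    boolean commit(); // true on success, false if the attempt must be retried
}

// Hypothetical marker type for failures that should never be retried.
class FatalTransactionException extends RuntimeException {
    FatalTransactionException(String message) {
        super(message);
    }
}

final class FlaggedRetry {
    static <T> T executeTransaction(Supplier<Transaction> newTransaction,
                                    Function<Transaction, T> fn) {
        while (true) {
            Transaction tx = newTransaction.get();
            boolean retryAllowed = true;
            try {
                return fn.apply(tx);
            } catch (FatalTransactionException e) {
                retryAllowed = false; // disables the commit-and-retry check below
                throw e;
            } finally {
                // Any other outcome (value or exception) is discarded if the commit fails.
                if (retryAllowed && !tx.commit()) {
                    continue;
                }
            }
        }
    }
}
```

In this sketch a fatal exception skips the commit entirely; whether it should still attempt a commit is a separate design decision.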
> Having the retry logic within the `finally` block seems like the best way of denoting the intention to me, since the intention is to swallow the result, whether that was a return or a throw.
Except that is not the documented intent of the `finally` construct:
The finally block always executes when the try block exits.
This ensures that the finally block is executed even if an
unexpected exception occurs. But finally is useful for more
than just exception handling — it allows the programmer to
avoid having cleanup code accidentally bypassed by a
return, continue, or break. Putting cleanup code in a
finally block is always a good practice, even when no
exceptions are anticipated.[0]
Using `finally` for implementing retry logic can be done, as you have illustrated, but that does not mean it is "the best way of denoting the intention." One could argue this is a construct specific to Java (the language) and does not make sense outside of this particular language-specific idiom.
> The intent of the transaction code is that the consistency is checked (using `tx.commit()`) "even if an unexpected exception occurs".
A transaction failing is the opposite of an unexpected event. Failure is a central case that any transaction has to handle. Therefore it should be handled explicitly instead of using exceptions.
Exceptions are for unexpected events such as the node running out of memory, or a process failing to write to disk.
> A transaction failing is the opposite of an unexpected event.
That's why it's denoted by a non-exceptional return value from `tx.commit()` in my sample code. When I've talked about exceptions here, I'm talking about exceptions raised within the transaction. If the transaction succeeds, those exceptions should be propagated to the calling code.
> Exceptions are for unexpected events such as the node running out of memory, or a process failing to write to disk.
Discussing valid uses of exceptions seems orthogonal to this (should OOM lead to a catchable exception [0], or should it crash the process?). In any case, if the process is still alive and the transaction code determines without error that "yes, this transaction was invalid due to other contending transactions", it should retry the transaction. If something threw due to lack of memory or disk space, chances are it will throw again within a successful transaction and the error will be propagated.
[0] As alluded to in my first post, you might want to add some special cases for exceptions/errors that you want to immediately propagate instead of retrying. Eg, you might treat `Error` subtypes differently, which includes `OutOfMemoryError` and other cases that suggest the program is in a potentially unusable state, but this still isn't required according to the intent of the transactional logic.
> The intent of the transaction code is that the consistency is checked (using `tx.commit()`) "even if an unexpected exception occurs".
First, having a `commit` unconditionally attempted when an exception is raised would surprise many developers. Exceptions in transactional logic are often used to represent a "rollback persistent store changes made thus far" scenario.
Second, using a condition within `finally` to indicate a retry due to a `commit` failing could be expressed in a clearer manner by having it within the `try` block as described by IntelliJ here[0].
> Exceptions in transactional logic are often used to represent a "rollback persistent store changes made thus far" scenario.
Handling can be added to change the transaction to be read-only if the inner code throws a particular exception, but the consistency should still be checked through a `commit` phase (at least in an OCC setting), so the `continue` in `finally` is still the correct way to do it.
> could be expressed in a clearer manner by having it within the `try` block as described by IntelliJ here[0].
Wrong link? The only solution I see there is to add a comment to suppress the warning, which sounds fine to me (eg, analogous to having a `// fallthrough` comment when intentionally omitting `break` statements within `switch`, since I can agree that both of these things are uncommon, but sometimes desirable).
> Handling can be added to change the transaction to be read-only if the inner code throws a particular exception, but the consistency should still be checked through a `commit` phase (at least in an OCC setting), so the `continue` in `finally` is still the correct way to do it.
This approach fails to account for `fn` performing multiple mutations where an exception is raised from statement N, where N > 1.
For example, suppose `fn` successfully updates a record in table `A`, then attempts to insert a record into table `B` which produces a constraint violation exception[0]. Unconditionally performing a `commit` in the `finally` block will result in the mutation in table `A` being persisted, thus resulting in an invalid system persistent state.
If the `try` block performed the `commit` and the `finally` block unconditionally performed a `rollback`, I believe the behavior you are after would be sound.
No, it's the link I intended. Its purpose was to show the warning that anyone working in a translation unit using the originally proffered technique would see, and to serve as a starting point for research.
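One possible reading of that suggestion, as a sketch only. The `Transaction` interface here is hypothetical, and it assumes `rollback()` is a harmless no-op after a successful `commit()`, which a real API may or may not guarantee:

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical transaction interface, assumed for illustration only.
interface Transaction {
    boolean commit(); // true on success, false if the attempt must be retried
    void rollback();  // assumed to do nothing once commit() has succeeded
}

final class CommitInTryRetry {
    static <T> T executeTransaction(Supplier<Transaction> newTransaction,
                                    Function<Transaction, T> fn) {
        while (true) {
            Transaction tx = newTransaction.get();
            try {
                T result = fn.apply(tx);
                if (tx.commit()) {
                    return result;
                }
                // Commit failed: leave the try block and loop for another attempt.
            } finally {
                tx.rollback(); // cleanup only, no control flow in finally
            }
        }
    }
}
```

With this shape an exception thrown by `fn` is never committed and never retried: it propagates after the rollback, so partial mutations from the failed attempt are not persisted.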
Doesn't that code ignore errors even if it runs out of retries? Don't you want to log every Exception that happens, even if the transaction will be retried?
A result of an inconsistent transaction should be discarded whether it's a return value or a thrown exception. If it runs out of tries another error should be thrown. This should only happen due to contention (overlapping transactions), not due to a logical exception within the transaction.
You can add extra logging to show results or exceptions within the transaction if you want (for the exception this would simply be a `catch` just before the `finally` that logs and rethrows).
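Roughly what those two additions could look like, as a sketch under assumptions: the `Transaction` interface, the `maxTries` parameter, and the choice of `IllegalStateException` for "out of tries" are all made up for this example. Logging uses the standard `java.util.logging` API.

```java
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical minimal transaction interface, assumed for illustration only.
interface Transaction {
    boolean commit(); // true on success, false if the attempt must be retried
}

final class LoggedBoundedRetry {
    private static final Logger LOG = Logger.getLogger(LoggedBoundedRetry.class.getName());

    static <T> T executeTransaction(Supplier<Transaction> newTransaction,
                                    Function<Transaction, T> fn,
                                    int maxTries) {
        for (int tries = 1;; tries++) {
            Transaction tx = newTransaction.get();
            try {
                return fn.apply(tx);
            } catch (RuntimeException e) {
                // Log every exception, even if this attempt ends up being retried.
                LOG.log(Level.WARNING, "attempt " + tries + " threw inside the transaction", e);
                throw e;
            } finally {
                if (!tx.commit()) {
                    if (tries >= maxTries) {
                        // Replaces whatever the attempt returned or threw.
                        throw new IllegalStateException(
                                "transaction did not commit after " + tries + " tries");
                    }
                    continue; // discard this attempt's outcome and retry
                }
            }
        }
    }
}
```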
I've omitted these extra things because they're orthogonal to the point that the simplest way to express this logic is by having the `continue` control flow unconditional on whether the code was successful, which is what you use `finally` for.
If you did this in Rust no one would complain, since the overall result is expressed as a first-class `Result<T, E>` value that can naturally be discarded. This is why Rust doesn't have `finally`.
Rust is also a lot more permissive about use of control flow, since you can write things like `foo(if x { y } else { continue }, bar)`.
Personally, I prefer when the language gives a bit more flexibility here. Of course you can write things that are difficult to understand, but my stance is still that my example code above is the simplest way to write the intended logic, until someone demonstrates otherwise.
I don't think this is a restriction that generally helps with code quality. If anything I've probably seen more bad code due to a lack of finding the simplest way to express control flow of an algorithm.
I'm sure there's some train of thought that says that continue/break/return from a loop is bad (see proponents of `Array.prototype.forEach` in JS), but I disagree with it.
and the principle still applies. The simplest solution still involves a `continue` within the `finally` block.
Whether it's a good idea to actually do this directly using SQL connections is another question. SQL databases usually use pessimistic locking, where the transaction failures are actually "deadlocks" that are preferably avoided through careful ordering of operations within the transaction (or more commonly, YOLOing it using an isolation level that allows read anomalies). Without going into all the details, this probably has a large influence over the design of the SQL APIs you're referring to.
You want to at least check that the exception was raised in the absence of read anomalies. The check for read anomalies in OCC happens during the commit phase.
Setting a transaction to read-only on error is possible using the code (using a rethrowing catch within the transaction), but this is not universally desirable.
If you're using transactions to run fairly arbitrary code atomically (assuming no static state outside of the transaction), the expected behaviour would be that modifications leading up to an exception (in a non-anomalous transaction) are still persisted. Eg, imagine the code within the transaction is updating a persisted retry counter before performing a fallible operation. In this case you want the counter to be updated so that the transaction doesn't just fail an infinite number of times, since each time you roll back on error you're just restoring the state that leads to the error.
Another case would be where the exception is due to something that was written within the transaction. If the exception were raised but the writes were not persisted, it would at least be confusing seeing the exception, and possibly logically incorrect depending on how the irrelevant exception is handled (since it's due to data that theoretically doesn't exist).