
SolidQueue crashes if database connection is lost, and takes Puma with it. #512

Open
darinwilson opened this issue Feb 7, 2025 · 2 comments

Comments


darinwilson commented Feb 7, 2025

We're running SolidQueue as a Puma plugin on a Rails 8 app, as our job processing load is currently quite small.
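
For reference, the plugin is enabled in config/puma.rb roughly like this (the env-var guard matches the Rails 8 default generator output; your setup may differ):

```ruby
# config/puma.rb (excerpt) - runs Solid Queue's supervisor inside the Puma process.
plugin :solid_queue if ENV["SOLID_QUEUE_IN_PUMA"]
```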

We recently had an incident where the server running Puma temporarily lost the connection to Postgres. This caused SolidQueue to crash with this message:

PQconsumeInput() FATAL:  terminating connection due to administrator command (PG::ConnectionBad)
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

and this in turn took down Puma:

Detected Solid Queue has gone away, stopping Puma...
- Gracefully stopping, waiting for requests to finish

I was able to reproduce this locally by shutting down Postgres after starting Rails.

When running Rails without the SolidQueue Puma plugin, if the database goes away, Rails throws an error whenever it tries to touch the database, but Puma stays up and the connections recover once the database comes back online.

If I run SolidQueue separately, via bin/jobs, it also crashes if the database goes away.

Obviously SolidQueue can't be expected to do much without a database, but would it be reasonable for it to behave as Rails does when the db goes offline, i.e. pause its activity and reconnect when the db is available again?
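
To illustrate what I mean, something along these lines (purely a hypothetical sketch, not Solid Queue's API - names and intervals are made up):

```ruby
# Hypothetical "pause and reconnect" wrapper around a block of database work:
# instead of crashing on a lost connection, sleep and retry until the DB is back
# or a maximum wait is exceeded.
def with_database_retry(check_interval: 5, max_wait: 300)
  waited = 0
  begin
    yield
  rescue ActiveRecord::ConnectionNotEstablished, PG::ConnectionBad
    raise if waited >= max_wait
    sleep check_interval
    waited += check_interval
    retry
  end
end

# Illustrative usage:
# with_database_retry { SolidQueue::Process.count }
```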

Thanks for all your work on this - SolidQueue has been a fantastic addition to Rails!


rosa commented Feb 10, 2025

Oh, interesting. This happens only for the supervisor; if any of the supervised processes crashes, the supervisor makes sure a new one is started 🤔 I think the supervisor would need some kind of recovery mechanism for when the DB fails, but it could also crash for other reasons. I think it makes sense to do this, but I won't have time in the next couple of months at least, so if someone wants to submit a PR for it, I'll be happy to review.
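
In case it helps frame a PR, a very rough sketch of that kind of supervisor-level recovery might look like the following (module and method names are hypothetical, not Solid Queue's actual internals):

```ruby
# Hypothetical module prepended into the supervisor - NOT Solid Queue's real code.
# On a lost DB connection, pause until the database is reachable again and retry,
# instead of letting the supervisor (and the Puma plugin with it) crash.
module SupervisorDatabaseRecovery
  def supervise
    super
  rescue ActiveRecord::ConnectionNotEstablished, PG::ConnectionBad
    SolidQueue.logger.warn "Database connection lost, pausing supervisor..."
    sleep 10 until database_reachable?
    retry
  end

  private
    def database_reachable?
      ActiveRecord::Base.connection.verify!
      true
    rescue ActiveRecord::ConnectionNotEstablished, PG::ConnectionBad
      false
    end
end
```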

@darinwilson
Author

Thanks for the feedback - good to know that it must be something at the supervisor level.

I'll dig into the code a bit and see if I can find a solution that might work.
