[Koha] Help needed with zombie background_jobs processes

Cindy Murdock Ames cmurdock at ccfls.org
Thu Apr 20 04:01:19 NZST 2023


Hi Jonathan,

I just tried sending SIGCHLD to the parent processes, but it didn't have any
effect.  The parents are
"/usr/bin/perl /usr/share/koha/bin/background_jobs_worker.pl --queue default"
and
"/usr/bin/perl /usr/share/koha/bin/background_jobs_worker.pl --queue long_tasks".
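
For anyone following along, some background on why SIGCHLD alone may do
nothing: a zombie is removed only when its parent calls wait() on it (or when
the parent exits and the zombie is adopted and reaped by init).  Sending
SIGCHLD helps only if the parent has a handler that actually does the wait.
Below is a minimal Python sketch of that mechanism, assuming Linux with /proc
mounted.  It is not Koha code, and the names proc_state and zombie_demo are
made up for illustration.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if gone."""
    try:
        with open(f"/proc/{pid}/stat") as fh:
            data = fh.read()
    except FileNotFoundError:
        return None
    # The format is "pid (comm) state ...". comm may contain spaces or
    # parentheses, so split on the last closing parenthesis.
    return data.rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Fork a child that exits at once; return its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)  # state while the parent has not yet reaped it
    os.waitpid(pid, 0)        # reap: the kernel drops the process table entry
    after = proc_state(pid)
    return before, after


if __name__ == "__main__":
    print(zombie_demo())
```

The child stays in state Z for as long as the parent skips the waitpid() call,
no matter which signals the parent receives in the meantime.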

worker-error.log has a few entries like these three from today:
20230419 08:44:53 ccfls-koha-worker-long_tasks: client (pid 12169) killed by signal 13, respawning
20230419 09:36:06 ccfls-koha-worker: client (pid 14398) killed by signal 13, respawning
20230419 09:59:35 ccfls-koha-worker: client (pid 29935) killed by signal 13, respawning
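
On Linux, signal 13 is SIGPIPE, which is raised when a process writes to a
pipe or socket whose other end has already been closed.  The entries above do
not say which connection that was.  As a rough sketch, entries of this shape
can be parsed and the signal number translated to a name with Python's
standard library.  The regular expression is my own guess at the line format
based only on the three entries above, and parse_kill_line is a hypothetical
helper name.

```python
import re
import signal

# Assumed line format, inferred from the three log entries quoted above.
LINE_RE = re.compile(
    r"^(?P<stamp>\d{8} \d{2}:\d{2}:\d{2}) (?P<worker>\S+): "
    r"client \(pid (?P<pid>\d+)\) killed by signal (?P<sig>\d+), respawning$"
)


def parse_kill_line(line):
    """Return a dict describing one 'killed by signal' entry, or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    signum = int(match.group("sig"))
    try:
        signame = signal.Signals(signum).name
    except ValueError:
        # Number not known on this platform; keep it numeric.
        signame = f"signal {signum}"
    return {
        "timestamp": match.group("stamp"),
        "worker": match.group("worker"),
        "pid": int(match.group("pid")),
        "signal": signame,
    }


if __name__ == "__main__":
    sample = ("20230419 08:44:53 ccfls-koha-worker-long_tasks: "
              "client (pid 12169) killed by signal 13, respawning")
    print(parse_kill_line(sample))
```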

Those timestamps correspond to three jobs in the jobs queue that didn't
complete and show a progress of "null/n" (n being a number that I think
corresponds to the number of items in the batch).  The first is a batch item
record modification and the other two are holds queue updates.

I cancelled these three jobs and the zombies remained.

worker-output.log has a number of entries like the ones below, but
unfortunately they have no timestamps, so I can't link them to anything.  The
timestamp on the file itself is from yesterday at 13:08, which I think
corresponds to a successful staging and import of records.

Use of uninitialized value $subfield_value in pattern match (m//) at /usr/share/koha/lib/Koha/SimpleMARC.pm line 435.
Use of uninitialized value $subfield_value in string eq at /usr/share/koha/lib/Koha/SimpleMARC.pm line 435.

I did try something else.  The parent process for the long_tasks queue had
apparently already respawned, but the one for the default queue hadn't, so I
killed it with -9.  The two zombies that had been there went away and the
default queue worker restarted.  Before killing it, I had tried a MARC upload,
which was stuck at 0%.  I cancelled that job and retried the upload after
killing the default queue worker, and it worked, but it spawned a new zombie,
which was a child of the long_tasks worker.  Yesterday things seemed to work if
there was only one zombie, but not two.  There are no new entries in either of
the worker-*.log files.
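
The zombies disappearing after the kill -9 fits normal Unix behaviour: when a
parent dies, its children (zombies included) are adopted by init or another
designated reaper, which then waits on them.  To keep track of which worker
owns which zombie, something like the following Python sketch could be used.
It assumes a procps-style ps that accepts "-eo pid,ppid,stat,comm".  The
function names are hypothetical, and the PIDs in the sample are invented, not
taken from our server.

```python
import subprocess


def find_zombies(ps_output):
    """Return (pid, ppid, command) for each process whose state starts with Z."""
    zombies = []
    for line in ps_output.splitlines():
        parts = line.split(None, 3)
        # Skip the header line and anything that is not a full process row.
        if len(parts) < 4 or not parts[0].isdigit():
            continue
        pid, ppid, stat, comm = parts
        if stat.startswith("Z"):
            zombies.append((int(pid), int(ppid), comm.strip()))
    return zombies


def current_zombies():
    """Run ps and return the zombies currently on this machine."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return find_zombies(result.stdout)


if __name__ == "__main__":
    # Invented sample output, for illustration only.
    sample = (
        "  PID  PPID STAT COMMAND\n"
        " 1201     1 Ss   daemon\n"
        " 1202  1201 S    perl\n"
        " 1300  1202 Z    perl <defunct>\n"
    )
    print(find_zombies(sample))
```

The PPID column is the process that would have to call wait() on the zombie,
so it shows which worker to look at.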

Thanks for your help.

c.
-----------------------------------------------------------
Cindy Murdock Ames
IT Services Director
Meadville Public Library | CCFLS
https://meadvillelibrary.org | https://ccfls.org

Please report tech support issues in Mantis:  https://mantis.ccfls.org


On Wed, Apr 19, 2023 at 2:44 AM Jonathan Druart <
jonathan.druart at bugs.koha-community.org> wrote:

> Did you have a look at worker-*.log? Nothing useful there?
>
> You can try to send SIGCHLD to the parent to kill the zombie.
>
> On Tue, Apr 18, 2023 at 22:09, Cindy Murdock Ames <cmurdock at ccfls.org> wrote:
> >
> > A few other things I've noticed:
> >
> > - Sometimes the zombie processes will go away on their own, sometimes, it
> > seems, when you retry the MARC import or whatever it was that failed.  This
> > one is really weird to me, as in all my years as a sysadmin I thought it
> > was not possible for zombie processes to go away without a reboot.  But
> > maybe that's changed and now zombies can rise from the dead.  Lol.
> >
> > - In looking at the jobs list in Koha, it seems that Holds queue updates
> > are especially prone to getting stuck at a progress of null/1.
> >
> > - If you reattempt a job that is stuck (i.e., reattempting a MARC file
> > upload or whatnot) it will often succeed.  The original failed job remains
> > with a progress of null.
> >
> > c.
> >
> > On Tue, Apr 18, 2023 at 3:55 PM Cindy Murdock Ames <cmurdock at ccfls.org> wrote:
> >>
> >> Yes, it's 22.11.04, package version.
> >>
> >> On Tue, Apr 18, 2023 at 2:59 PM Jonathan Druart <jonathan.druart at bugs.koha-community.org> wrote:
> >>>
> >>> Hi Cindy,
> >>> Which exact version of Koha 22.11.xx? It should be the latest one.
> >>> Regards,
> >>> Jonathan
> >>>
> >>> On Tue, Apr 18, 2023 at 19:13, Cindy Murdock Ames <cmurdock at ccfls.org> wrote:
> >>> >
> >>> > Hi all,
> >>> >
> >>> > A couple weekends ago I upgraded our Koha instance from 22.05 to 22.11,
> >>> > and I'm having trouble with the background_jobs processes becoming
> >>> > zombies after a very short amount of time, necessitating a reboot.  I
> >>> > suspect it's a misconfiguration on my part, so if someone can shed some
> >>> > light I'd really appreciate it!
> >>> >
> >>> > The first symptom was our MARC imports getting stuck at "import queued",
> >>> > and after some digging (and thanks to the thread in this list with the
> >>> > subject of "Background job / Staging MARC import stuck at 0%") I found I
> >>> > was entirely missing the <message_broker> section in our config, so I
> >>> > added this:
> >>> >
> >>> >  <message_broker>
> >>> >    <hostname>localhost</hostname>
> >>> >    <port>61613</port>
> >>> >    <username>guest</username>
> >>> >    <password>guest</password>
> >>> >    <vhost></vhost>
> >>> >  </message_broker>
> >>> >
> >>> > Which seemed to resolve it, but now I find that the background_jobs
> >>> > processes are going zombie after processing only a few jobs.  Here's some
> >>> > info from the rabbitmq log after restarting the server:
> >>> >
> >>> > =INFO REPORT==== 18-Apr-2023::12:23:46 ===
> >>> > node           : rabbit@ccflskoha
> >>> > home dir       : /var/lib/rabbitmq
> >>> > config file(s) : /etc/rabbitmq/rabbitmq.config (not found)
> >>> > cookie hash    : ojvkUE6eUtku7kHlx3uiFg==
> >>> > log            : /var/log/rabbitmq/rabbit@ccflskoha.log
> >>> > sasl log       : /var/log/rabbitmq/rabbit@ccflskoha-sasl.log
> >>> > database dir   : /var/lib/rabbitmq/mnesia/rabbit@ccflskoha
> >>> >
> >>> > Is it problematic that /etc/rabbitmq/rabbitmq.config is missing?  Anything
> >>> > else I should be looking at?  We're running on Ubuntu SE 18.04 if that is
> >>> > helpful.
> >>> >
> >>> > Thanks much!
> >>> > Cindy

