Re: [Koha] [Koha-devel] What's on in koha-devel #11
Hello Wendy, (Answering to the list) I would say you should try both and see where you fit best :) The topics of the two meetings are quite different; you can find them in advance on the wiki (and can add your own!) Regards, Jonathan On Wed, 5 Apr 2017 at 13:23 Wendy Sharkey <catalog@bfli.org> wrote:
Of the two meetings you mention, which one would best fit a librarian user and abuser?
Wendy Sharkey Bennington, Vermont
------------------------------ *From:* Jonathan Druart <jonathan.druart@bugs.koha-community.org> *To:* "koha@lists.katipo.co.nz" <koha@lists.katipo.co.nz>; "koha-devel@lists.koha-community.org" <koha-devel@lists.koha-community.org>
*Sent:* Tuesday, April 4, 2017 9:55 AM *Subject:* [Koha-devel] What's on in koha-devel #11
Hello librarians and developers,
Not much has happened this month in terms of development. Things are going very slowly at the moment, despite the hackfest in Marseille.
We see a lot of new people on the mailing list and the IRC channel asking questions, but we need more people to be involved in the discussions and the signoff process. There are hundreds of people on this list, and the Koha community needs you to move the project forward. I know that nobody has enough time, but one hour per week or even per month can help a lot. It is very easy, fun and interesting :)
= How to get involved? =
The first step is to join us on the #koha IRC channel and the koha-devel mailing list. Start by introducing yourself: where are you from, what do you do, how long have you been using Koha, and how do you think you can help?
Attend meetings! We have two kinds of monthly meeting: a general one to talk about Koha-related topics, and a development meeting to talk about technical topics. It is important for the community to have a place to come together and try to make things move. If you are a librarian, we need you to tell developers which direction they should take.
Open bug reports It is important to know that anybody can open new bug reports on our bug tracker (https://bugs.koha-community.org). It is useful for the development team to learn about new bugs, and it is also important for you to search for known bugs. Many bugs are known to developers, but we do not know how to fix them because we are waiting for feedback from users. The dashboard (http://dashboard.koha-community.org) is very helpful for finding the "hot" bugs that need to be fixed/tested/QAed. The "Overall bug tracker health status" section displays the number of new bugs that are important to follow and move forward. If a developer submits a patch for the bug you opened, try and test it!
Test bugs Here is how you can really make things move for the community: become involved in the development process and make the Koha project better. Sandboxes (https://wiki.koha-community.org/wiki/Sandboxes) are available to test patches easily. You do not need to install anything on your computer and do not need any technical skills. If you are a Koha user, you can test patches!
Write patches If you are a developer or have a minimum of technical skills, you can write your own patches to fix bugs or develop new features. Our dedicated wiki page (wiki.koha-community.org/wiki/Submitting_A_Patch) will guide you through the different steps. It is very easy to get a development environment using KohaDevBox (https://github.com/digibib/kohadevbox). Start with small patches :)
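For the curious, the KohaDevBox route mentioned above boils down to a few commands. This is a rough transcript paraphrased from the project README (prerequisites such as Vagrant and VirtualBox, and the exact steps, change over time, so check the repository for the current instructions):

```
$ git clone https://github.com/digibib/kohadevbox
$ cd kohadevbox
$ vagrant up      # provisions a VM with a full Koha development setup
$ vagrant ssh     # log in to the box and start hacking
```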
= Refactoring = If you have read my previous "what's on in koha-devel" emails, you may have noticed I talked a lot about "refactoring" bugs, and maybe you do not know what that is and what it brings. The Koha codebase is old and has grown very quickly since 2006. Over the last few years the development team decided to make the code more robust and slow down the integration of patches, adopting a stronger integration process. That is why we need at least one tester to validate a patch, then one QAer to review it technically, and finally the Release Manager to push it. That means four independent people for a single patch. It has made our codebase stronger and less regression-prone. Some parts of the code now need to be rewritten, which is why I have focused on the "refactoring" work over the last year. The goal is to rewrite modules of Koha, step by step. It cleans the code, makes it more readable, reduces the number of lines, centralises responsibilities, homogenises the code, fixes bugs, adds test coverage, etc. In the mid to long term it means a modern application that is easy to maintain, robust and flexible. Bugs will be easier to fix, and enhancements will be quicker, and so less expensive, to develop. Do you see how useful it is? So yes, it is boring to test because it does not bring anything new to the interface, but the project needs it.
If you have been using Koha for months or years, it has saved you money, you are happy to use it and you love it. If you have already received answers from people on the mailing lists or the IRC channel, it is time to say thank you by getting involved!
If you have any specific questions on how to get involved, you can reach me on the IRC channel or by email. I can help and guide you if necessary.
A developer who loves Koha but needs help,
Jonathan _______________________________________________ Koha-devel mailing list Koha-devel@lists.koha-community.org http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel website : http://www.koha-community.org/ git : http://git.koha-community.org/ bugs : http://bugs.koha-community.org/
Hi

We're running a new host with Debian GNU/Linux 8 and Koha 16.11.04. Usually the Koha OPAC login process is quick, as expected. But sometimes a login attempt takes 10 to 40 seconds until the login process finally completes. On such occasions the following process can be seen with "top", taking 99% of the CPU resources:

phsh-ko+ 59422 56031 11 16:30 ? 00:00:35 /usr/share/koha/opac/cgi-bin/opac/opac-user.pl

This happens more often when logging in from the internal network, and less often from the external network (i.e. the internet). Has anyone experienced this problem before and can maybe give us a hint on how to tame "opac-user.pl"?

Best wishes and Happy Easter: Michael
Hello Michael, How many issues and holds does the patron have? Regards, Jonathan On Thu, 13 Apr 2017 at 15:13 Michael Kuhn <mik@adminkuhn.ch> wrote:
Hi
We're running a new host with Debian GNU/Linux 8 and Koha 16.11.04.
Usually the Koha OPAC login process is quick, as expected. But sometimes a login attempt takes 10 to 40 seconds until the login process finally completes. On such occasions the following process can be seen with "top", taking 99% of the CPU resources:
phsh-ko+ 59422 56031 11 16:30 ? 00:00:35 /usr/share/koha/opac/cgi-bin/opac/opac-user.pl
This happens more often when logging in from the internal network, and less often from the external network (i.e. the internet).
Has anyone experienced this problem before and can maybe give us a hint on how to tame "opac-user.pl"?
Best wishes and Happy Easter: Michael _______________________________________________ Koha mailing list http://koha-community.org Koha@lists.katipo.co.nz https://lists.katipo.co.nz/mailman/listinfo/koha
Hi Jonathan

Some patrons have no issues/holds at all, some have very few (2 or 3), some have a few more. As I wrote, the slow login doesn't happen every time, but it happens fairly often. At the moment we can't see a pattern for when it happens, but it does happen...

Is there a way to debug "opac-user.pl" so we could find out what the script is actually doing during the slow login?

Regards, Michael

On 17.04.2017 at 14:37, Jonathan Druart wrote:
In production I am not aware of anything you could do (well, there is strace if you are patient and quick :D). In a test environment you could use a code profiler or tell MySQL to log slow queries. But the code profiler will be hit-and-miss if the problem does not happen 100% of the time, and htop would have told you if MySQL was the culprit. On Mon, 17 Apr 2017 at 17:43 Michael Kuhn <mik@adminkuhn.ch> wrote:
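To make the slow-query suggestion concrete, the runtime switches look roughly like this (a sketch: `slow_query_log` is the current variable name on MySQL 5.1 and later, replacing the older `log_slow_queries`; run as a privileged user, and note that `long_query_time` defaults to 10 seconds, so lowering it is usually needed to catch queries in the 1-10 second range):

```
mysql> SET GLOBAL slow_query_log = 1;
mysql> SET GLOBAL slow_query_log_file = '/tmp/slow.txt';
mysql> SET GLOBAL long_query_time = 1;  -- log anything slower than 1 second
```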
Hi Michael

What version of Koha is it? Do you have errors or messages in the log files? You can monitor them using tail. First I would have a look at opac-error.log and plack-error.log, something like:

tail -f opac-error.log

Kind regards Marc

On 17.04.2017 at 23:20, Jonathan Druart wrote:
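Marc's tail advice, sketched on a synthetic log file so the commands run anywhere. The real log locations depend on the install; Debian package installs typically keep them under /var/log/koha/&lt;instance&gt;/, which is an assumption to verify on your own system:

```shell
# Create a small stand-in for opac-error.log; on a real server you would
# point tail at the actual log file instead of this sample.
LOG=/tmp/opac-error.sample.log
printf '[warn] template warning\n[error] DBI connect failed\n' > "$LOG"

# One-shot look at the most recent entry:
tail -n 1 "$LOG"    # prints: [error] DBI connect failed

# On the real server, follow the file live while reproducing the slow
# login, e.g.:  tail -f /var/log/koha/<instance>/opac-error.log
```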
Hi Marc and Jonathan

Marc wrote:
What version of Koha is it?
We're running a new host with Debian GNU/Linux 8 and Koha 16.11.04.
Do you have errors or messages in the log files? You can monitor them using tail. At first place I would have a look at opac-error.log and plack-error.log, something like: tail -f opac-error.log
No, there are no errors or messages. As soon as the login attempt is made in the Koha OPAC, the following process taking 99% of the CPU can be seen with "top" or "ps -ef":

phsh-ko+ 62293 26131 30 14:39 ? 00:00:22 /usr/share/koha/opac/cgi-bin/opac/opac-user.pl

When the login finally succeeds (after maybe 30 seconds) the following is written to the file "plack.log":

178.238.175.156 - - [18/Apr/2017:14:40:28 +0200] "POST /opac/opac-user.pl HTTP/1.1" 200 26320 "http://bibliothek.phsh.ch/cgi-bin/koha/opac-main.pl?logout.x=1" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"

But nothing is written to the file "opac-error.log". Jonathan wrote:
In production I am not aware of something you could do (well there is strace if you are patient and quick :D) In test you could use a code profiler or tell mysql to log slow queries. But the code profiler will be hazardous if it does not happens 100% of the time and htop would have told you if mysql was the culprit.
I activated the MySQL slow query log at runtime, using the following commands:

mysql> set global log_slow_queries = 1; mysql> set global slow_query_log_file = '/tmp/slow.txt';

Then I made another login attempt in the Koha OPAC. The file "/tmp/slow.txt" is created, but it doesn't list any slow query, so I think the problem is not in the queries themselves:

# more /tmp/slow.txt /usr/sbin/mysqld, Version: 5.5.54-0+deb8u1 ((Debian)). started with: Tcp port: 3306 Unix socket: /var/run/mysqld/mysqld.sock Time Id Command Argument

Then I made some attempts to use strace, but I'm not sure how to use it properly, and especially how to interpret the output... So what I've tried is:

1. I made a login attempt 2. As soon as I saw the process was taking 99% of the CPU, I picked the PID (here: 62503) from the output of "top" and executed the following command:

# strace -p 62503

The output then looked like the following (full output at the end of this email):

Process 62503 attached

followed by many many lines like the following:

brk(0xb63b000) = 0xb63b000 brk(0xb65c000) = 0xb65c000 brk(0xb67d000) = 0xb67d000

Then after about 30 seconds it says:

write(12, "Status: 200 OK\r\nSet-Cookie: CGIS"..., 8192) = 8192 write(12, "dsearch -->\n \n\n <div class=\"row-"..., 8192) = 8192 write(12, "GS_DELETED = _(\"Hinzugef\303\274gte Ta"..., 8192) = 8192 rt_sigprocmask(SIG_BLOCK, [STKFLT], [], 8) = 0 rt_sigaction(SIGSTKFLT, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_7], [], 8) = 0 rt_sigaction(SIGRT_7, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_11], [], 8) = 0 rt_sigaction(SIGRT_11, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [TERM], [], 
8) = 0 rt_sigaction(SIGTERM, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_4], [], 8) = 0 rt_sigaction(SIGRT_4, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [XCPU], [], 8) = 0 rt_sigaction(SIGXCPU, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [KILL], [], 8) = 0 rt_sigaction(SIGKILL, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 0x7ffc3dff2ad0, 8) = -1 EINVAL (Invalid argument) rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_15], [], 8) = 0 rt_sigaction(SIGRT_15, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [URG], [], 8) = 0 rt_sigaction(SIGURG, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_22], [], 8) = 0 rt_sigaction(SIGRT_22, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_17], [], 8) = 0 rt_sigaction(SIGRT_17, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [IO], [], 8) = 0 rt_sigaction(SIGIO, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [TTIN], [], 8) = 0 rt_sigaction(SIGTTIN, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, {0x7fd5bd4d4b40, [], SA_RESTORER, 
0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_14], [], 8) = 0 rt_sigaction(SIGRT_14, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0 rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_IGN, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [CONT], [], 8) = 0 rt_sigaction(SIGCONT, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_26], [], 8) = 0 rt_sigaction(SIGRT_26, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [FPE], [], 8) = 0 rt_sigaction(SIGFPE, {SIG_IGN, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_IGN, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_6], [], 8) = 0 rt_sigaction(SIGRT_6, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_24], [], 8) = 0 rt_sigaction(SIGRT_24, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [QUIT], [], 8) = 0 rt_sigaction(SIGQUIT, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_13], [], 8) = 0 rt_sigaction(SIGRT_13, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0 
rt_sigaction(SIGINT, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_8], [], 8) = 0 rt_sigaction(SIGRT_8, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_32], [], 8) = 0 rt_sigaction(SIGRT_32, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_27], [], 8) = 0 rt_sigaction(SIGRT_27, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_16], [], 8) = 0 rt_sigaction(SIGRT_16, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_29], [], 8) = 0 rt_sigaction(SIGRT_29, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_30], [], 8) = 0 rt_sigaction(SIGRT_30, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [PROF], [], 8) = 0 rt_sigaction(SIGPROF, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RTMIN], [], 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [TRAP], [], 8) = 0 rt_sigaction(SIGTRAP, 
{SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_5], [], 8) = 0 rt_sigaction(SIGRT_5, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [PWR], [], 8) = 0 rt_sigaction(SIGPWR, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_23], [], 8) = 0 rt_sigaction(SIGRT_23, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_3], [], 8) = 0 rt_sigaction(SIGRT_3, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [STOP], [], 8) = 0 rt_sigaction(SIGSTOP, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 0x7ffc3dff2ad0, 8) = -1 EINVAL (Invalid argument) rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [SYS], [], 8) = 0 rt_sigaction(SIGSYS, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [WINCH], [], 8) = 0 rt_sigaction(SIGWINCH, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_12], [], 8) = 0 rt_sigaction(SIGRT_12, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) 
= 0 rt_sigprocmask(SIG_BLOCK, [USR2], [], 8) = 0 rt_sigaction(SIGUSR2, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_20], [], 8) = 0 rt_sigaction(SIGRT_20, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_21], [], 8) = 0 rt_sigaction(SIGRT_21, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [XFSZ], [], 8) = 0 rt_sigaction(SIGXFSZ, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_28], [], 8) = 0 rt_sigaction(SIGRT_28, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [ABRT], [], 8) = 0 rt_sigaction(SIGABRT, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_18], [], 8) = 0 rt_sigaction(SIGRT_18, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [VTALRM], [], 8) = 0 rt_sigaction(SIGVTALRM, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [IO], [], 8) = 0 rt_sigaction(SIGIO, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [TTOU], [], 8) = 0 rt_sigaction(SIGTTOU, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, 
{0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [ILL], [], 8) = 0 rt_sigaction(SIGILL, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [ABRT], [], 8) = 0 rt_sigaction(SIGABRT, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_19], [], 8) = 0 rt_sigaction(SIGRT_19, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_1], [], 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_31], [], 8) = 0 rt_sigaction(SIGRT_31, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_25], [], 8) = 0 rt_sigaction(SIGRT_25, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [BUS], [], 8) = 0 rt_sigaction(SIGBUS, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_10], [], 8) = 0 rt_sigaction(SIGRT_10, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [USR1], [], 8) = 0 rt_sigaction(SIGUSR1, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [TSTP], [], 8) = 0 rt_sigaction(SIGTSTP, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 
0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_9], [], 8) = 0 rt_sigaction(SIGRT_9, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [HUP], [], 8) = 0 rt_sigaction(SIGHUP, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, {0x7fd5bd4d4b40, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0 rt_sigaction(SIGALRM, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [SYS], [], 8) = 0 rt_sigaction(SIGSYS, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [SEGV], [], 8) = 0 rt_sigaction(SIGSEGV, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigprocmask(SIG_BLOCK, [RT_2], [], 8) = 0 rt_sigaction(SIGRT_2, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, {SIG_DFL, [], SA_RESTORER, 0x7fd5bcd11890}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 chdir("/") = 0 prctl(PR_SET_NAME, 0x1d79ae0, 0, 0, 0) = 0 write(12, "argets\" : [ \"title-string\" ] }\n "..., 1696) = 1696 lseek(12, 0, SEEK_SET) = 0 lseek(12, 0, SEEK_CUR) = 0 ioctl(12, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffc3dff2bd0) = -1 ENOTTY (Inappropriate ioctl for device) lseek(12, 0, SEEK_CUR) = 0 fstat(12, {st_mode=S_IFREG|0600, st_size=26272, ...}) = 0 fcntl(12, F_SETFD, FD_CLOEXEC) = 0 fstat(12, {st_mode=S_IFREG|0600, st_size=26272, ...}) = 0 read(12, "Status: 200 OK\r\nSet-Cookie: CGIS"..., 8192) = 8192 read(12, "dsearch -->\n \n\n <div class=\"row-"..., 8192) = 8192 read(12, "GS_DELETED = _(\"Hinzugef\303\274gte Ta"..., 
8192) = 8192 read(12, "argets\" : [ \"title-string\" ] }\n "..., 8192) = 1696 read(12, "", 8192) = 0 close(12) = 0 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1892, ...}) = 0 write(4, "178.238.175.156 - - [18/Apr/2017"..., 238) = 238 write(9, "HTTP/1.1 200 OK\r\nCache-Control: "..., 418) = 418 write(9, "\n\n\n\n\n\n<!DOCTYPE html>\n<!-- TEMPL"..., 25902) = 25902 select(16, [9], NULL, NULL, {1, 0}) = 0 (Timeout) close(9) = 0 kill(26131, SIG_0) = 0 write(8, "62503 waiting\n", 14) = 14 getsockopt(6, SOL_SOCKET, SO_TYPE, [1], [4]) = 0 accept(6,

At this point the login attempt finally succeeds. I actually have no idea what this means. Can you please help?

Best wishes: Michael

-- Managing Director · Qualified Librarian BBS, IT Specialist (Swiss Federal Certificate) Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Switzerland T 0041 (0)61 261 55 61 · E mik@adminkuhn.ch · W www.adminkuhn.ch
Use `strace -c -p PID` and paste the whole output please (hastebin or similar)

On Tue, 18 Apr 2017 at 10:23 Michael Kuhn <mik@adminkuhn.ch> wrote:
Hi Marc and Jonathan
Marc wrote:
What version of Koha is it?
We're running a new host with Debian GNU/Linux 8 and Koha 16.11.04.
Do you have errors or messages in the log files? You can monitor them using tail. At first place I would have a look at opac-error.log and plack-error.log, something like: tail -f opac-error.log
No, there are no errors or messages. As soon as the login attempt in the Koha OPAC is made, the following process, taking 99% of the CPU, can be seen with "top" or "ps -ef":
phsh-ko+ 62293 26131 30 14:39 ? 00:00:22 /usr/share/koha/opac/cgi-bin/opac/opac-user.pl
When the login finally succeeds (after maybe 30 seconds) the following is written in the file "plack.log":
178.238.175.156 - - [18/Apr/2017:14:40:28 +0200] "POST /opac/opac-user.pl HTTP/1.1" 200 26320 "http://bibliothek.phsh.ch/cgi-bin/koha/opac-main.pl?logout.x=1" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
But nothing is written to the file "opac-error.log".
Jonathan wrote:
In production I am not aware of something you could do (well there is strace if you are patient and quick :D) In test you could use a code profiler or tell mysql to log slow queries. But the code profiler will be hazardous if it does not happen 100% of the time, and htop would have told you if mysql was the culprit.
I activated the Mysql slow query log during runtime, using the following commands:
mysql> set global log_slow_queries = 1;
mysql> set global slow_query_log_file = '/tmp/slow.txt';
Then I made another login attempt in the Koha OPAC. The file "/tmp/slow.txt" is created but it doesn't list any slow query, so I think the problem is not in the query itself:
# more /tmp/slow.txt
/usr/sbin/mysqld, Version: 5.5.54-0+deb8u1 ((Debian)). started with:
Tcp port: 3306  Unix socket: /var/run/mysqld/mysqld.sock
Time                 Id Command    Argument
Then I made some attempts to use strace, but I'm not sure how to really use it and especially how to interpret the output. So what I've tried is:
1. I made a login attempt
2. As soon as I saw the query was taking 99% of the CPU, I picked the PID (here: 62503) from the output of "top" and executed the following command:
# strace -p 62503
The output then looked like the following (full output at the end of this email):
Process 62503 attached
followed by many many lines like the following:
brk(0xb63b000) = 0xb63b000
brk(0xb65c000) = 0xb65c000
brk(0xb67d000) = 0xb67d000
Then after about 30 seconds it says: [SKIP]
At this point the login attempt finally succeeds. Actually I have no idea what this means - can you please help?
Best wishes: Michael

--
Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
T 0041 (0)61 261 55 61 · E mik@adminkuhn.ch · W www.adminkuhn.ch

_______________________________________________
Koha mailing list  http://koha-community.org
Koha@lists.katipo.co.nz
https://lists.katipo.co.nz/mailman/listinfo/koha
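As an aside, a hedged sketch (not from the thread; standard Linux strace options): the `-tt` and `-T` flags prepend a wall-clock timestamp and append the time spent inside each call, so a 30-second stall shows up as one call with a huge `<duration>` instead of an undifferentiated stream like the `brk()` lines above.

```shell
# Against the live process it would be run as (PID taken from the thread):
#
#   strace -tt -T -p 62503 -o /tmp/opac-user.trace
#
# Finding the slowest calls afterwards, demonstrated here on a canned,
# invented excerpt (the poll/write lines below are illustrative only):
cat > /tmp/opac-user.trace <<'EOF'
14:40:01.000001 brk(0xb63b000) = 0xb63b000 <0.000004>
14:40:01.000010 poll([{fd=3, events=POLLIN}], 1, 30000) = 0 <30.000123>
14:40:31.000200 write(9, "HTTP/1.1 200 OK"..., 418) = 418 <0.000051>
EOF
# Sort numerically on the duration field (after the '<') and show the worst:
sort -t'<' -k2 -rn /tmp/opac-user.trace | head -n1
```

On a real trace, the line printed first by this sort is the syscall that ate the wall-clock time.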
Hi Jonathan

I made a new attempt to use strace which has a different and much shorter output:

1. I made a login attempt.

2. As soon as I saw the query was taking 99% of the CPU I picked the PID (here: 63895) from the output of "top" and executed the following command:

# strace -c -p 63895

When the login finally succeeded I ended strace with CTRL+C. The output then looked like the following (output can also be seen via http://adminkuhn.ch/download/strace.txt):

Process 63895 attached
^CProcess 63895 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0       146           read
  0.00    0.000000           0       137           write
  0.00    0.000000           0         8           open
  0.00    0.000000           0        13           close
  0.00    0.000000           0        54        22 stat
  0.00    0.000000           0        19           fstat
  0.00    0.000000           0         5           lstat
  0.00    0.000000           0       123           poll
  0.00    0.000000           0        16         2 lseek
  0.00    0.000000           0       404           brk
  0.00    0.000000           0       202         6 rt_sigaction
  0.00    0.000000           0       416           rt_sigprocmask
  0.00    0.000000           0        12        12 ioctl
  0.00    0.000000           0         2           select
  0.00    0.000000           0         4           alarm
  0.00    0.000000           0         1           accept
  0.00    0.000000           0         2           getpeername
  0.00    0.000000           0         2           getsockopt
  0.00    0.000000           0         2           kill
  0.00    0.000000           0        11           fcntl
  0.00    0.000000           0         4           getdents
  0.00    0.000000           0         1           getcwd
  0.00    0.000000           0         3           chdir
  0.00    0.000000           0         1           unlink
  0.00    0.000000           0         2           geteuid
  0.00    0.000000           0         1           getegid
  0.00    0.000000           0         2           getgroups
  0.00    0.000000           0         4           prctl
  0.00    0.000000           0         2           openat
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                  1599        42 total

I wonder if this output is helpful?

Best wishes: Michael

On 18.04.2017 at 16:02, Jonathan Druart wrote:
I do not understand why %time is 0, should not be, especially if the process takes 30s.

Try to get the most complete output with a long query, then a complete output with a normal query. Maybe we will see the differences between both.

You should write a script not to have to retrieve the pid manually, something using `ps aux | grep opac-user.pl | tr -s ' ' | cut -d' ' -f2`

On Tue, 18 Apr 2017 at 11:31 Michael Kuhn <mik@adminkuhn.ch> wrote:
Hi Jonathan
I do not understand why %time is 0, should not be, especially if the process takes 30s Try to get the most complete output with a long query, then a complete output with a normal query. Maybe we will see the differences between both. You should write a script not to have to retrieve the pid manually, something using `ps aux | grep opac-user.pl | tr -s ' ' | cut -d' ' -f2`
Here's another attempt:

1. I made a login attempt.

2. As soon as I saw the query that was taking 99% of the CPU I executed the following command in another terminal:

strace -c -p $(ps aux | grep -v grep | grep opac-user.pl | tr -s ' ' | cut -d' ' -f2)

When the login finally succeeded I ended strace with CTRL+C. This is the output:

Process 64958 attached
^CProcess 64958 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         5           read
  0.00    0.000000           0         8           write
  0.00    0.000000           0         2           close
  0.00    0.000000           0         1           stat
  0.00    0.000000           0         2           fstat
  0.00    0.000000           0         3           lseek
  0.00    0.000000           0         2           mmap
  0.00    0.000000           0       534           brk
  0.00    0.000000           0        66         2 rt_sigaction
  0.00    0.000000           0       136           rt_sigprocmask
  0.00    0.000000           0         1         1 ioctl
  0.00    0.000000           0         1           select
  0.00    0.000000           0         1           getsockopt
  0.00    0.000000           0         1           kill
  0.00    0.000000           0         1           fcntl
  0.00    0.000000           0         1           chdir
  0.00    0.000000           0         1           prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   766         3 total

Does this help?
Try to get the most complete output with a long query, then a complete output with a normal query. Maybe we will see the differences between both.
It is not possible to execute strace for a "normal" login attempt like this, because an attempt to log into the Koha OPAC is usually very quick: by the time I have clicked the "Login" button and executed strace in the terminal, the login has already succeeded and the above command finds no PID...

Is it maybe possible to measure the execution time within the sections of the Perl script "opac-user.pl"?

Somehow the whole thing seems to me like some kind of network problem (DNS? latency? timeout?), but I wouldn't know how to find out, since I don't really understand what "opac-user.pl" does.

Regards, Michael
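The race described above — strace cannot be attached before a fast login finishes — can be narrowed with a small helper. A hedged sketch (the helper function and the `[o]` grep trick are not from the thread): writing the pattern as `[o]pac-user.pl` stops grep from matching its own command line, so no `grep -v grep` step is needed.

```shell
# Hypothetical helper: print the PID of opac-user.pl, if one exists.
find_opac_pid() {
    ps aux | grep '[o]pac-user.pl' | tr -s ' ' | cut -d' ' -f2 | head -n1
}
# The extraction itself, demonstrated on the ps line quoted in this thread:
sample='phsh-ko+ 62293 26131 30 14:39 ? 00:00:22 /usr/share/koha/opac/cgi-bin/opac/opac-user.pl'
printf '%s\n' "$sample" | grep '[o]pac-user.pl' | tr -s ' ' | cut -d' ' -f2 | head -n1
```

Started in a second terminal before clicking "Login", a loop like `while p=$(find_opac_pid); do [ -n "$p" ] && exec strace -c -p "$p"; done` would attach almost immediately, so even a quick request could be caught.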
There are now 766 calls vs 1599 in your previous paste. I would have expected more, less does not help :)

If it is the pl script execution that takes 30s, it should not come from network latencies.

On Tue, 18 Apr 2017 at 13:09 Michael Kuhn <mik@adminkuhn.ch> wrote:
That's maybe just because it doesn't take the same amount of time with every attempt - sometimes it's 10 seconds, but (most of the time) it goes up to 40 seconds.

On 18.04.2017 at 18:23, Jonathan Druart wrote:
There are now 766 calls vs 1599 in your previous paste. I would have expected more, less does not help :) If it is the pl script execution that takes 30s, it should not come from network latencies.
Hi

We're running a productive host with Debian GNU/Linux 8 and Koha 16.11.04. One month ago I already described the case that on this host the Koha OPAC login process is often very slow. At such occasions the following process can be seen with "top", taking 99% of the CPU resources:

phsh-ko+ 59422 56031 11 16:30 ? 00:00:35 /usr/share/koha/opac/cgi-bin/opac/opac-user.pl

Nothing is written to the file "opac-error.log". I had already sent the output of strace in my e-mail from 18 April 2017 18:09, but this didn't help any further. See https://lists.katipo.co.nz/public/koha/2017-April/047804.html

With my poor Perl knowledge I have now measured the execution time of the various parts of the script "opac-user.pl". As I found out, everything is always very fast (using less than 1 second execution time) except the very last function in this script, which looks as follows:

output_with_http_headers $cgi, $cookie, $template->output, $content_type;

Usually this function takes around 2 seconds to complete, but in the case of the slow login process it takes around 32 seconds, which makes me think there must be some kind of timeout (30 seconds). I tried to find out more but I am stuck in file "/usr/share/koha/lib/C4/Output.pm", where function "output_html_with_http_headers" makes use of another function called "output_with_http_headers".

Can anyone please give me a hint how I could dig deeper to maybe find out why the script "opac-user.pl" sometimes takes so long to complete its task?

Best wishes: Michael
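The timeout suspicion above can also be checked from outside the server. A hedged sketch (not from the thread; the form fields are illustrative): curl can split a request's wall-clock time into phases, which separates "the server thinks for 30 seconds" from DNS or transfer problems.

```shell
# Against the real OPAC it would look like this (credentials elided):
#
#   curl -s -o /dev/null \
#        -w 'dns=%{time_namelookup} connect=%{time_connect} ttfb=%{time_starttransfer} total=%{time_total}\n' \
#        --data 'userid=...&password=...' \
#        http://bibliothek.phsh.ch/cgi-bin/koha/opac-user.pl
#
# If total and ttfb are both ~32 s while dns and connect are tiny, the delay
# is inside the server, matching the in-script timing. Offline demonstration
# of the -w write-out format against a local file:// URL:
curl -s -o /dev/null -w 'total=%{time_total}\n' file:///dev/null
```

`time_namelookup`, `time_connect`, `time_starttransfer`, and `time_total` are standard curl `--write-out` variables.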
Excerpts from Michael Kuhn's message of 2017-05-09 19:57:51 +0200:
With my poor Perl knowledge I have now measured the execution time of the various parts of script "opac-user.pl". As I found out everything is always very fast (using less than 1 second execution time) except the very last function in this script which looks as follows:
output_with_http_headers $cgi, $cookie, $template->output, $content_type;
I am not familiar with this code, but I took a quick look at it just now. If I were trying to debug this problem, I would measure the execution time of the last statement in output_with_http_headers:

print $query->header($options), $data;

If I understand correctly, this print is going to send a big blob of HTML (preceded by a header) via Apache to the client. If Apache is timing out for some reason, perhaps that would cause a big delay in this print statement.
Hi

Yesterday Mark Alexander wrote:
It doesn't seem to be the function itself that takes the time.

I have now changed the scripts "opac-user.pl" and "Output.pm" as follows so they will give me the time at certain moments:

1. Print time at the very beginning of script "opac-user.pl"
2. Print time just before calling function "output_html_with_http_headers" in script "opac-user.pl"
3. Print time at the very beginning of script "Output.pm"
4. Print time at the very beginning in function "output_html_with_http_headers" in script "Output.pm"
5. Print time at the end in function "output_html_with_http_headers" in script "Output.pm"
6. Print time after calling function "output_html_with_http_headers" in script "opac-user.pl"

I would have expected the output in this order, but in fact I get the following sequence:

3. 09:18:44 START (Output.pm called by plack.psgi)
1. 09:18:44 START opac-user.pl / 1494407924.86717
2. 09:18:45 START opac-user.pl (right before function) / 1494407925.45211
* 4. 09:19:17 START function (in Output.pm) called by opac-user.pl
* 5. 09:19:17 END function (in Output.pm) called by opac-user.pl
* TIME function: 0.0027921199798584
6. 09:19:17 END opac-user.pl / 1494407957.18462
TIME : 32.3174500465393

If I disable Plack I get the following sequence (no output for 3.):

1. 09:20:48 START opac-user.pl / 1494408048.50046
2. 09:20:48 START opac-user.pl (right before function) / 1494408048.74332
* 4. 09:21:20 START function (in Output.pm) called by opac-user.pl
* 5. 09:21:20 END function (in Output.pm) called by opac-user.pl
* TIME function: 0.00142717361450195
6. 09:21:20 END opac-user.pl / 1494408080.4559
TIME : 31.9554369449615

So everything works very fast and fine in script "opac-user.pl" until the function "output_html_with_http_headers" is called. Then it takes about 32 seconds until the function "output_html_with_http_headers" actually starts its work; the function itself then works very fast again.
Does anybody have an idea what could cause this very annoying delay before the function actually starts its work?

I'm not sure if there is a connection, but when I restart Koha using "service koha-common" it is sometimes very fast and sometimes it takes about 30 seconds too.

Best wishes: Michael
output_with_http_headers is a function that calls the templates code. I would bet it is an IO problem.

Please run iotop while reproducing the problem. I would expect to see some iowait.

Do you have template_cache_dir set? Is it on a fast hard drive? You could try mounting a tmpfs filesystem in there and compare results.

Also for debugging you should better use plack's tools instead:

enable 'Debug', panels => [ qw(DBITrace Memory Timer) ];

In plack.psgi, and even the NYTProf panel. Maybe we need to improve the docs for this.

On Wed, May 10, 2017, 6:25 AM Michael Kuhn <mik@adminkuhn.ch> wrote:
--
Tomás Cohen Arazi
Theke Solutions (https://theke.io)
✆ +54 9351 3513384
GPG: B2F3C15F
Hi Tomás

Thanks for your hints! You wrote:
output_with_http_headers is a function that calls the templates code. I would bet it is an IO problem.
Please run iotop while reproducing the problem. I would expect to see some iowait.
I installed the command "iotop" and ran it while reproducing the problem. But it shows nothing special:

Total DISK READ : 0.00 B/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
  TID  PRIO  USER      DISK READ  DISK WRITE  SWAPIN  IO>     COMMAND
64690  be/4  root      0.00 B/s   0.00 B/s    0.00 %  0.03 %  [kworker/1:2]
64000  be/4  www-data  0.00 B/s   0.00 B/s    0.00 %  0.00 %  apache2 -k start
    1  be/4  root      0.00 B/s   0.00 B/s    0.00 %  0.00 %  init
    2  be/4  root      0.00 B/s   0.00 B/s    0.00 %  0.00 %  [kthreadd]
    3  be/4  root      0.00 B/s   0.00 B/s    0.00 %  0.00 %  [ksoftirqd/0]
 1540  be/4  Debian-g  0.00 B/s   0.00 B/s    0.00 %  0.00 %  gnome-she~nf worker]
    5  be/0  root      0.00 B/s   0.00 B/s    0.00 %  0.00 %  [kworker/0:0H]
Do you have template_cache_dir set?
I added the following line in file "koha-conf.xml", just after the two memcached lines:

 <memcached_servers></memcached_servers>
 <memcached_namespace></memcached_namespace>
 <template_cache_dir>/tmp</template_cache_dir>

Then I restarted Koha. The problem persists. I couldn't find anything special in directory "/tmp". But since I don't know what "template_cache_dir" is supposed to do, I don't know what to expect from it...
Is it on a fast hard drive? You could try mounting a tmpfs filesystem in there and compare results.
The Koha host resides in a virtual machine (run by Hyper-V, using Integrated Services). Unfortunately I have no direct access to the console there.
Also for debugging you should better use plack's tools instead:
enable 'Debug', panels => [ qw(DBITrace Memory Timer) ];
In plack.psgi, and even the NYTProf panel. Maybe we need to improve the docs for this.
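Beyond the Debug middleware, Devel::NYTProf can be pointed at just the suspect call. A hypothetical sketch — the enable/disable calls are from the Devel::NYTProf documentation, while the surrounding code is illustrative:

```perl
# Sketch: profile only the slow section with Devel::NYTProf.
# Start the script with the environment variable NYTPROF=start=no so that
# profiling stays off until explicitly enabled (per the Devel::NYTProf docs).
use Devel::NYTProf;

DB::enable_profile();
output_html_with_http_headers $query, $cookie, $template->output;
DB::disable_profile();

# Afterwards, generate an HTML report from the dump file:
#   nytprofhtml nytprof.out
```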
I added this line to file "plack.psgi" and restarted Koha. Koha wasn't responding anymore, but the file "plack-error.log" said:

Error while loading /etc/koha/plack.psgi: Can't locate Plack/Middleware/Debug.pm in @INC (you may need to install the Plack::Middleware::Debug module) (@INC contains: /usr/share/koha/lib/installer /usr/share/koha/lib /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.20.2 /usr/local/share/perl/5.20.2 /usr/lib/x86_64-linux-gnu/perl5/5.20 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.20 /usr/share/perl/5.20 /usr/local/lib/site_perl .) at /etc/koha/plack.psgi line 29.

I tried to add "use Plack::Middleware::Debug;" to the file, but it turned out this Perl module isn't part of the Koha installation, so I installed it:

# apt-get install libplack-middleware-debug-perl

Then I activated the following lines in file "plack.psgi":

use Plack::Middleware::Debug;
...
enable 'Debug', panels => [ qw(DBITrace Memory Timer) ];

Then I restarted Koha. This gave me no additional output in logfile "plack-error.log". My time measurement still gives the following:

1. 12:45:14 START opac-user.pl / 1494420314.20786
2. 12:45:14 START opac-user.pl (right before function) / 1494420314.41777
* 4. 12:45:47 START function (in Output.pm) called by opac-user.pl
* 5. 12:45:47 END function (in Output.pm) called by opac-user.pl
* TIME function: 0.00157809257507324
6. 12:45:47 END opac-user.pl / 1494420347.53212
TIME : 33.3242571353912

When everything is done the following lines appear in file "plack.log":
178.238.175.156 - - [10/May/2017:14:45:47 +0200] "POST /opac/opac-user.pl HTTP/1.1" 200 319007 "http://bibliothek.phsh.ch/cgi-bin/koha/opac-main.pl?logout.x=1" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0"
178.238.175.156 - - [10/May/2017:14:45:48 +0200] "GET /opac/errors/404.pl HTTP/1.1" 404 205852 "http://bibliothek.phsh.ch/cgi-bin/koha/opac-user.pl" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0"
178.238.175.156 - - [10/May/2017:14:45:48 +0200] "GET /opac/errors/404.pl HTTP/1.1" 404 205852 "http://bibliothek.phsh.ch/cgi-bin/koha/opac-user.pl" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0"

Does this make any sense to you?

Best wishes: Michael (slowly going nuts...)
Hi

We're running a new host with Debian GNU/Linux 8 and Koha 16.11.04.

On 13 April I first wrote that sometimes, when trying to log into the Koha OPAC, it takes more than 30 seconds until the login process finally comes to an end, while the process "opac-user.pl" is eating up 99% of the CPU.

Some of you have kindly given me some hints, but everything I tried was of no avail: logging MySQL slow queries, iotop, "strace -c -p PID", tcpdump, checking logs (plack-error.log, opac-error.log etc.), executing various time measurements of the Perl scripts, deactivating other programs (like Plack, EZproxy etc.), enabling debugging for Plack, and so on.

Eventually I executed "strace -t -s 1024 -p PID" and tried to understand the output. I doubt I did, but some lines gave me the idea that the problem was connected with the card expiry date. The user data were migrated from another ILS which contained a card expiry date of 9999-12-31 for every user. I manually changed this to 2030-12-31 via SQL, and all of a sudden the login process was as quick as experienced elsewhere. So the problem feels SOLVED by a workaround, at least in our Koha instance!

But meanwhile we have found out the following:

1. 9999-12-31 is a valid date accepted by MySQL (see https://dev.mysql.com/doc/refman/5.5/en/datetime.html).

2. The newly configured patron categories in our Koha instance all have an enrollment period of 999 months (strangely it's not possible to choose a higher number).

3. The migrated user data was imported via Koha menu "Tools > Patrons and circulation > Import patrons". NO ERROR occurred, and all the data (including the date 9999-12-31) was correctly written into table "borrowers".

4. In Koha menu "Patrons > New patron", when a new patron is added with expiry date "31.12.9999" this is accepted WITHOUT ERROR. (The local Koha date system preference is set to "dd.mm.yyyy".)

5. Trying to log into the Koha OPAC will immediately lead to the described problem with the delayed login.

6. I tried step 4 on my own demo installation of Koha 16.11.00 with expiry date "31/12/9999" - there a dialogue appears saying "Invalid year entered in field dateexpiry". There is only a button "OK" - I clicked it and then saved the user data. This worked without further errors, and the date can be found in field "borrowers.dateexpiry" as "9999-12-31". It seems rather strange to me that Koha warns about an "invalid year" (whatever that means) but accepts it nonetheless. Trying to log in with this user in the OPAC of my demo installation is as quick as expected.

To me the behavior described in steps 1-5 feels like a bug, because if the date 9999-12-31 is accepted by the patron import tool and also by MySQL, then it shouldn't be necessary to change that date. (But it seems this problem only exists on our Koha instance.)

Also the behavior described in step 6 feels at least like a minor bug, because if Koha warns of an "invalid year" it should 1) be more explicit (what exactly is invalid about the year?) and 2) not allow saving that "invalid" year.

What do you think of this?

Best wishes: Michael
I still cannot reproduce the problem. ^9999 dates are handled in our Koha::DateUtils module, take a look at:

223     # FIXME: see bug 13242 => no TZ for dates 'infinite'
224     if ( $dt->ymd !~ /^9999/ ) {
225         my $tz = $dateonly ? DateTime::TimeZone->new(name => 'floating') : C4::Context->tz;
226         $dt->set_time_zone( $tz );
227     }

Well, there is a FIXME, but basically it says that if the date is from 9999 we should use floating dates to avoid long processing when comparing dates. Take a look at bugs 12669 and 13242, and at https://metacpan.org/pod/DateTime#Determining-the-Local-Time-Zone-Can-Be-Slo...

I guess there is a use of DateTime with the tz somewhere in the code.
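The effect described here can be timed directly. A small sketch, assuming the behaviour documented for DateTime (timings will vary by machine; "Europe/Zurich" is just an example zone): attaching a named time zone to a year-9999 date forces DateTime::TimeZone to compute DST spans far into the future, while the 'floating' zone skips that work entirely.

```perl
# Sketch: compare assigning a named vs. a floating time zone to a
# far-future date (the slowness described in the DateTime docs).
use DateTime;
use Time::HiRes qw(time);

my $dt = DateTime->new( year => 9999, month => 12, day => 31 );

my $t = time;
$dt->clone->set_time_zone('Europe/Zurich');   # named zone: can take many seconds
printf "named zone:    %.3fs\n", time - $t;

$t = time;
$dt->clone->set_time_zone('floating');        # floating zone: essentially free
printf "floating zone: %.3fs\n", time - $t;
```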
Hi,

If you are running Plack, you will want to be looking in the plack-error.log.

Cheers,
Liz

On 10/05/17 05:57, Michael Kuhn wrote:
Hi
We're running a productive host with Debian GNU/Linux 8 and Koha 16.11.04.
One month ago I already described the case that on this host the Koha OPAC login process is often very slow. At such occasions the following process can be seen with "top", taking 99% of the CPU resources:
phsh-ko+ 59422 56031 11 16:30 ? 00:00:35 /usr/share/koha/opac/cgi-bin/opac/opac-user.pl
Nothing is written to the file "opac-error.log".
I had already sent the output of strace in my e-mail from 18 April 2017 18:09, but this didn't help any further. See https://lists.katipo.co.nz/public/koha/2017-April/047804.html
With my poor Perl knowledge I have now measured the execution time of the various parts of script "opac-user.pl". As I found out everything is always very fast (using less than 1 second execution time) except the very last function in this script which looks as follows:
output_with_http_headers $cgi, $cookie, $template->output, $content_type;
Usually this function takes around 2 seconds to complete, but in the case of the slow login process it takes around 32 seconds which makes me think there must be some kind of timeout (30 seconds).
I tried to find out more but I am stuck in file "/usr/share/koha/lib/C4/Output.pm" where function "output_html_with_http_headers" makes use of another function called "output_with_http_headers".
Can anyone please give me a hint how I could dig deeper to maybe find out why the script "opac-user.pl" sometimes takes so long to complete its task?
Best wishes: Michael
--
Liz Rea
Catalyst.Net Limited
Level 6, Catalyst House, 150 Willis Street, Wellington.
P.O. Box 11053, Manners Street, Wellington 6142
GPG: B149 A443 6B01 7386 C2C7 F481 B6c2 A49D 3726 38B7
Hi Liz
If you are running plack, you will want to be looking in the plack-error.log.
Thanks for the hint. I have already been looking there (and in many other log files), but to no avail. See https://lists.katipo.co.nz/public/koha/2017-April/047796.html

Best wishes: Michael
Hi Michael,

A more general hint about mailing lists: It seems that you started this thread by answering a mail from another thread (Re: [Koha] [Koha-devel] What's on in koha-devel #11, Tue, 11 Apr 2017 21:49:51 +0000). See https://lists.katipo.co.nz/pipermail/koha/2017-April/thread.html as an illustration of what I mean.

So it is better to start a thread with a brand new mail to make it a "real" thread. That makes it easier for others to follow the thread and/or to find it in the archives.

Kind regards
Marc

On 13.04.2017 at 20:13, Michael Kuhn wrote:
Hi
We're running a new host with Debian GNU/Linux 8 and Koha 16.11.04.
Usually the Koha OPAC login process is quick as expected. But sometimes when trying to login it takes 10 to 40 seconds until the login process finally comes to an end. At such occasions the following process can be seen with "top", taking 99% of the CPU resources:
phsh-ko+ 59422 56031 11 16:30 ? 00:00:35 /usr/share/koha/opac/cgi-bin/opac/opac-user.pl
This happens more often when trying to log in from the internal network, and less often when trying to log in from the external network (i.e. the internet).
Has anyone experienced this problem before and can maybe give us a hint how to tame "opac-user.pl"?
Best wishes and Happy Easter: Michael
Hi Marc
A more general hint about mailing lists:
It seems that you started this thread by answering a mail from another thread (Re: [Koha] [Koha-devel] What's on in koha-devel #11, Tue, 11 Apr 2017 21:49:51 +0000). See https://lists.katipo.co.nz/pipermail/koha/2017-April/thread.html as an illustration of what I mean.
So it is better to start a thread with a brand new mail to make it a "real" thread. It makes it easier for others to follow the thread and/or to find the thread in the archives.
Thanks for the hint! In https://lists.katipo.co.nz/pipermail/koha/2017-April/thread.html I see what you mean. I may have "answered" that mail (Re: [Koha] [Koha-devel] What's on in koha-devel #11, Tue, 11 Apr 2017 21:49:51 +0000) to get the e-mail address of the Koha mailing list, but I then changed the subject to "Slow Koha OPAC login process", and my mail client (Mozilla Thunderbird) does show the correct subject. However, I found the following line in my mail that may produce this wrong threading:

In-Reply-To: <CAJzKNY79pBxhyPSpSrdGEo9jp9_bs4Z=acZU7juzmwXGSxCViw@mail.gmail.com>

So I will try to avoid this in the future by actually creating a new mail and entering the mailing list address by hand.

Best wishes: Michael
participants (6)

- Jonathan Druart
- Liz Rea
- Marc Véron
- Mark Alexander
- Michael Kuhn
- Tomas Cohen Arazi