Quick cheat sheet for strace

Today I've noticed that the gwenview (a GUI program to view images) was sitting on CPU even though I wasn't using that. This reminded me of an idea to practice my tracing skills.

For those, who don't know what am I talking about, in Linux you can use a very special kind of software, called debuggers or tracers, connect to any existing process (or spawn a new one) and see, what exactly it is doing! Not exactly like reading the source code of the program, but very close to that. Let's say some developer did a silly app, which does nothing rather opening some file, writing some bullshit into it and closing it. In a loop. So if you run that program with using strace you would be able to see, that it calls specific functions from Linux kernel like fopen / fclose / fwrite. I'm oversimplifying off course, but you get the idea.

So let's install strace and practice a bit:

apt install strace

And run it for some process (assuming you have a process with pid=70014):

sudo strace -p 70014 -r -o ./strace.log

Then after a short while press CTRL+C.
Let's break here for a while and see what options we used.

-p option tells strace to connect to an existing process
-r instructs it to add to the output a time difference between calls
-o <filename> redirects all the output to a file

Let's see what we've got in our log file. Interesting, huh? But it's a raw data, a lot of raw calls to some kernel functions. How can we get any additional use out of it?

We can ask strace to give us a summary report, like what kind of kernel functions were called and how long did all that took, in total for each function:

pi@orangepi4-lts:~$ sudo strace -c -p 70014
strace: Process 70014 attached
^Cstrace: Process 70014 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 42.47    0.178028          11     15074     14139 openat
 21.86    0.091656          25      3615           read
 11.62    0.048727           8      5925         2 newfstatat
  7.75    0.032476          10      3068           write
  5.08    0.021315          24       862        22 faccessat
  4.28    0.017959          19       945           ppoll
  2.47    0.010357          11       935           close
  1.37    0.005738           9       614        59 statx
  1.32    0.005552           6       924           fstat
  1.10    0.004626          27       168           getdents64
  0.61    0.002575           9       271           ioctl
  0.03    0.000109          10        10           lseek
  0.01    0.000047           9         5           getuid
  0.01    0.000043           8         5           geteuid
  0.00    0.000012          12         1           clone
  0.00    0.000000           0         2           futex
------ ----------- ----------- --------- --------- ----------------
100.00    0.419220          12     32424     14222 total
 

Here you can see, the most time this program spent opening some files with openat function, then reading something out of them with read and then getting file info with newfsatat

Knowing these facts, we can return back to inspecting raw log, but giving this time more scrutiny to what kind of files this pesky gwenview is reading all the time:

grep -E "openat|read|newfstatat" strace.log | less

It now becomes clear, that gwenview spending a lot of CPU while  attempting to open some thumbnails, usually it does that more than once for every thumbnail file, but gets an error from Kernel each time, but it repeats to try that over and over again:

     0.000093 openat(AT_FDCWD, "/home/pi/.cache/thumbnails/large/cd736b031fc7750f7d7ee3ca307
cee8e.png.pgm", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

To me it's clearly a bug of gwenview it shouldn't act like this. It also consumed a lot of memory and went into swap. Sadly, I liked it much. But we can report this bug or check if the most recent version of that application behaves the same.

What could have bothered it, is that I opened it to display images in a folder, which was getting updated in a background - I do have an app on my android phone which syncs my images from Android gallery to my Linux PC over ssh.

I wanted you to look at this article - where a wonderful mate Raghu nailed that whole strace topic down

Update from future: I reported that bug and it was fix the very same day by magnificent The other behavior I was seeing (consuming a lot of memory first, then swap, then slowly making system to die) was  already fixed before But given Debian don't update packages all that often, a vast majority of its users (like me) won't see that fix.  

No comments:

Post a Comment

Start here

Disable Firefox from updating itself and flash those annoying "Restart to Keep Using Firefox" messages on you

I recently switched from Brave to Firefox. Just because Brave appeared to be some proprietary shit, even though they're masking themselv...