Debug an Emacs crash

Run Emacs
Invoke GDB
- Debug a running Emacs process
- Run Emacs under GDB
Type continue, or do not type continue?
- If Emacs crashes
- If Emacs hangs (or seems to be stuck in some infinite loop)
Show the backtrace
- Go to the correct stack frame
- Last keys and last command
Lisp backtrace
Find source of problems
Report a bug in GNU Emacs
Follow a bug report
- Reply to a bug
- Close a bug
Read more…

Run Emacs

On http://sourceforge.net/projects/emacs-bin/files/releases/, you can find optimized Windows binaries with debug info.

On https://sourceforge.net/projects/emacs-bin/files/snapshots/debug/, you can find unoptimized builds.

Invoke GDB

Before reaching for GDB, first try C-g with debug-on-quit set to t. It might work!

Debug a running Emacs process

Attach GDB (the GNU debugger) to Emacs.

$ cd /cygdrive/c/Program\ Files\ \(x86\)/emacs-trunk/bin
$ gdb -p <emacs-PID>

Run Emacs under GDB

Running Emacs under GDB is not too different from attaching the debugger to it, provided you attach before the problem happens. But running Emacs under GDB is preferable because it keeps the standard streams of emacs.exe connected to the console where you run GDB, so GDB commands such as pp, which call functions inside Emacs, still work.

When you attach the debugger to a running Emacs, the standard streams are not available, and pp and its ilk will appear to do nothing.

$ cd /cygdrive/c/Program\ Files\ \(x86\)/emacs-trunk/bin
$ gdb ./emacs.exe
...
(gdb) set debugexceptions 1
(gdb) run -Q

When the problem happens, I expect that GDB will show us the exception name (that's what the above setting is about).

Don't use the F12 key binding in Emacs! Otherwise, you will have false alarms: F12 (on Windows) causes an immediate break into the debugger. This is not a GDB thing, it's a Windows thing.

Type continue, or do not type continue?

That is the question.

If Emacs crashes

When you attach GDB to an Emacs that has crashed and is showing an abort dialog, you must type:

(gdb) continue

so that GDB stops Emacs again when it gets the fatal signal.

In this case, wait for another GDB prompt, and only then collect the backtrace from all threads.

Otherwise, the backtrace is not useful: without continue, it would just show that Emacs is in an exception handler.

If Emacs hangs (or seems to be stuck in some infinite loop)

Do not type continue if you attached GDB to an Emacs that appears to be hung.

Otherwise, Emacs would be running again: since there's no fatal signal, typing continue would just let Emacs carry on with its infinite loop, and GDB would not get control back.

So, just collect the backtrace from all threads.

However, type finish before invoking bt or xbacktrace, because these invoke functions inside Emacs, which hit an assertion violation in this situation.

If the symptom of the bug is that Emacs fails to respond

From c:/Program Files (x86)/emacs-24.4/share/emacs/24.4/etc/DEBUG:

Don't assume Emacs is `hung'–it may instead be in an infinite loop. To find out which, make the problem happen under GDB and stop Emacs once it is not responding. (If Emacs is using X Windows directly, you can stop Emacs by typing C-z at the GDB job.) Then try stepping with `step'. If Emacs is hung, the `step' command won't return. If it is looping, `step' will return.

If this shows Emacs is hung in a system call, stop it again and examine the arguments of the call. If you report the bug, it is very important to state exactly where in the source the system call is, and what the arguments are.

If Emacs is in an infinite loop, try to determine where the loop starts and ends. The easiest way to do this is to use the GDB command `finish'. Each time you use it, Emacs resumes execution until it exits one stack frame. Keep typing `finish' until it doesn't return–that means the infinite loop is in the stack frame which you just tried to finish.

Stop Emacs again, and use `finish' repeatedly again until you get back to that frame. Then use `next' to step through that frame. By stepping, you will see where the loop starts and ends. Also, examine the data being used in the loop and try to determine why the loop does not exit when it should.

Show the backtrace

Typing explicitly:

(gdb) thread 1

right after attaching to the process selects thread 1 (usually the main thread), so that a subsequent bt produces the C-level backtrace for it. It is a good thing to do in any case when attaching to an Emacs process that's in trouble.

Collect the backtrace from all threads, like this:

(gdb) thread apply all backtrace

and post to gnu.emacs.bug everything it displays.

bt full is not useless, but more often than not it produces a lot of information, most of which is unneeded and just clutters the message. I prefer just bt at first, and only later bt full at selected frames, if ever. It is much more efficient to print the values of the few relevant variables than all the locals.

Go to the correct stack frame

If you need to be in Thread 1 in frame #2 (the frame that called emacs_abort, for example):

(gdb) thread 1
(gdb) frame 2

Last keys and last command

If there's no usable information in the backtrace, please show the last keys you typed. They should be available in the recent_keys Lisp vector, which is a circular queue; recent_keys_index - 1 is the index of the last key.

Also, Vthis_command and Vreal_this_command should tell which command was being executed at the time of the crash.
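
To make the circular-queue indexing concrete, here is a minimal Python sketch (it is not Emacs code) of reading such a buffer oldest-first. It assumes, as described above, that recent_keys_index names the slot where the next key would be written, and that unused slots hold an empty value (None here). The function names and the idea of copying the vector into a Python list are made up for illustration.

```python
def keys_in_typing_order(recent_keys, recent_keys_index):
    """Return the keys of a circular queue, oldest first.

    recent_keys        -- list standing in for the Lisp vector; unused
                          slots hold None (an assumption of this sketch)
    recent_keys_index  -- slot where the NEXT key would be written, so the
                          last key typed sits at recent_keys_index - 1
    """
    size = len(recent_keys)
    if size == 0:
        return []
    start = recent_keys_index % size
    # Slots from the write position to the end are the oldest entries;
    # slots before the write position are the newest.
    ordered = recent_keys[start:] + recent_keys[:start]
    return [key for key in ordered if key is not None]


def last_key(recent_keys, recent_keys_index):
    """Return the most recently typed key, wrapping around at slot 0."""
    return recent_keys[(recent_keys_index - 1) % len(recent_keys)]
```

The modulo in last_key covers the wrap-around case: when recent_keys_index is 0, the last key typed is in the final slot of the vector, not at index -1.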

Lisp backtrace

Such C-level backtraces are not useful without the Lisp backtrace part. What Emacs developers need is the names of some Lisp functions that we could then examine in order to look for potential infloops. That's impossible without a Lisp backtrace.

To get a Lisp backtrace, you need to:

Instruct GDB to read the .gdbinit file (where the Emacs-specific x* commands are defined) by using the source command:
```
(gdb) source .gdbinit
```
Type:
```
(gdb) xbacktrace
```
and show the results.

Emacs versions

Official releases and pretest binaries are provided with stripped executables, so a .gdbinit file there will not help.

Development snapshots are not stripped of the debug info.

Get a .gdbinit for that version of Emacs

Having a .gdbinit file from the wrong Emacs version is a problem. So only use .gdbinit with the Emacs version it matches.

If you don't have the .gdbinit file to go with the binary you are running, get a matching .gdbinit file for the development version you use from the bzr repository (src/ subdirectory).

If you put this .gdbinit file in the directory where you start GDB, GDB reads it automatically.

Otherwise, when you enter GDB, just type source .gdbinit to load it.

Nuisance with latest GDB versions

To countermand the warning "auto-loading has been declined by your `auto-load safe-path' set to `$debugdir:$datadir/auto-load'", have this:

set auto-load safe-path /

in your ~/.gdbinit (always read).

Find source of problems

Find where Emacs loops

When Emacs is hung for some reason, the backtrace doesn't show where it is hung or inflooping, because attaching to such a process catches it in some random place. What is needed is information about where Emacs loops.

If Emacs is in an infinite loop, try to determine where the loop starts and ends. The easiest way to do this is to use the GDB command finish (.gdbinit not needed). Each time you use it, Emacs resumes execution until it exits one stack frame. Keep typing finish until it doesn't return – that means the infinite loop is in the stack frame which you just tried to finish.

Before you start with finish, you need to switch to the main thread. This is usually thread 1, so you type:

(gdb) thread 1

and then start the finish dance.

To make sure thread 1 is the main thread, verify that thread apply all backtrace ends with the main function for thread 1, like this:

Thread 1 (Thread 5552.0xb8c):
#0  0x7c91120f in ntdll!DbgUiConnectToDbg () from /cygdrive/c/WINDOWS/system32/ntdll.dll
[...]
#58 0x01004f54 in Frecursive_edit () at keyboard.c:846
#59 0x01002b68 in main (argc=1, argv=0xa43e18) at emacs.c:1655

Report which finish command doesn't return.
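
If the output of thread apply all backtrace is long, the check that thread 1 ends in main can be done mechanically. The following Python sketch is only an illustration: it assumes the output was pasted into a string, and that thread headers and frame lines look like the excerpt above. The regular expressions and function names are made up for this example and may need adjusting for other GDB versions.

```python
import re

THREAD_HEADER = re.compile(r"^Thread (\d+) \(")
FRAME_LINE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(")


def outermost_functions(backtrace_text):
    """Map each thread number to the function of its last listed frame."""
    outermost = {}
    current_thread = None
    for line in backtrace_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current_thread = int(header.group(1))
            continue
        frame = FRAME_LINE.match(line)
        if frame and current_thread is not None:
            # Later frames overwrite earlier ones, so the value left at
            # the end is the outermost frame of that thread.
            outermost[current_thread] = frame.group(2)
    return outermost


def main_threads(backtrace_text):
    """Return the thread numbers whose outermost frame is main."""
    return [thread for thread, func in outermost_functions(backtrace_text).items()
            if func == "main"]


sample = """\
Thread 1 (Thread 5552.0xb8c):
#0  0x7c91120f in ntdll!DbgUiConnectToDbg () from /cygdrive/c/WINDOWS/system32/ntdll.dll
[...]
#58 0x01004f54 in Frecursive_edit () at keyboard.c:846
#59 0x01002b68 in main (argc=1, argv=0xa43e18) at emacs.c:1655
"""
print(main_threads(sample))  # prints [1]
```

If the printed list does not contain 1, switch to whichever thread it names before starting with finish.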

Find memory problem

Find large memory allocation (in GDB)

Putting a breakpoint at xmalloc, conditioned on some large allocation size, might show who is requesting this much memory.

(gdb) break xmalloc if size > 100000
(gdb) commands
       > bt
       > continue
       > end

Do similarly for xzalloc and xrealloc.

Then run Emacs as usual, and the code which allocates these large chunks of memory will be shown in the backtraces.
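
Since the same breakpoint recipe is repeated for three allocators, one option is to generate the GDB commands instead of typing them. The Python sketch below only builds text. It assumes that xzalloc and xrealloc name their byte-count parameter size, as the xmalloc breakpoint above does; check this against the sources of the Emacs version you are debugging. The function name and the default threshold are chosen for illustration.

```python
ALLOCATORS = ("xmalloc", "xzalloc", "xrealloc")


def large_allocation_breakpoints(threshold=100000, functions=ALLOCATORS):
    """Build GDB commands that print a backtrace on each large allocation."""
    lines = []
    for function in functions:
        # Conditional breakpoint: only stop when the request is large.
        lines.append(f"break {function} if size > {threshold}")
        # Commands run at each stop: show the caller, then resume.
        lines.extend(["commands", "  bt", "  continue", "end"])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the commands; redirect the output to a file of your choice.
    print(large_allocation_breakpoints(), end="")
```

Save the output to a file and load it from the GDB prompt with the source command, the same way .gdbinit is loaded, before running Emacs.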

Check Emacs memory usage (in Emacs)

In the unlikely case the memory is used by Elisp objects (rather than by C-level objects), you can install memory-usage (available from GNU ELPA) and do M-x memory-usage when the process's size is suspicious (assuming that at that point Emacs is still sufficiently usable to run the above command).

See who loaded a file

Put a breakpoint at Fload, using the following setup:

(gdb) break Fload
(gdb) commands
  > p file
  > xstring
  > end
(gdb) r -Q

Now, each time the breakpoint is hit, you will see which file is being loaded. Just type c RET (or just RET after the first time) and wait for the file to show in the output of the above commands; then type bt to see who loaded it.

Report a bug in GNU Emacs

Post all this information via M-x report-emacs-bug (or write a mail to bug-gnu-emacs@gnu.org), so that a bug will be filed on the Emacs bug tracker. It will also show which packages you have loaded.

Also, try to remember what you were doing immediately before the incident, and what mode the current buffer was in.

Follow a bug report

See the Emacs Bug Tracker.

Reply to a bug

You can't reply to a bug from the Web interface, but you can:

- write a mail to <bug-number>@debbugs.gnu.org to reply, or
- try M-x gnus-read-ephemeral-emacs-bug-group to browse Emacs bug IDs as an ephemeral group.

Close a bug

Feel free to close your own bugs. Just mail <bug-number>-done@debbugs.gnu.org rather than <bug-number>@debbugs.gnu.org.
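
The reply and close addresses differ only by the -done suffix. As a small illustration, here is a Python sketch that builds both from a bug number. The helper names are made up, and 12345 is a placeholder rather than a real bug.

```python
DEBBUGS_DOMAIN = "debbugs.gnu.org"


def reply_address(bug_number):
    """Address for adding a message to an existing bug report."""
    return f"{int(bug_number)}@{DEBBUGS_DOMAIN}"


def close_address(bug_number):
    """Address for closing the bug; note the -done suffix."""
    return f"{int(bug_number)}-done@{DEBBUGS_DOMAIN}"


# 12345 is a made-up bug number, used only for illustration.
print(reply_address(12345))  # prints 12345@debbugs.gnu.org
print(close_address(12345))  # prints 12345-done@debbugs.gnu.org
```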