AutomatedProblemReportsNotification

Revision 1 as of 2006-07-29 07:06:47

Summary

This spec describes giving visual feedback when AutomatedProblemReports are triggered.

Rationale

This is largely eye candy. However, when a user reports problems automatically, the more security-sensitive data may need review, and alerting the user to crashes and to the delivery of reports is a good way to draw attention to it.

Use cases

  • Rhythmbox spontaneously crashes, and the end user notices a "crash icon" in his Notification Area.

Scope

The scope of this spec includes things reported by AutomatedProblemReports.

Design

A tool similar to update-notifier should exist to give visual feedback of AutomatedProblemReports through the notification area. This tool should display different icons depending on the current situation, for example:

  • No icon when nothing is waiting.
  • Information icon when new crashes have been sent but extra information (stack dumps) needs review.

    • Crashes are "new" when no other crashes appear to be from the same cause for a certain time period (e.g. a month).
    • For the time being, let's limit the crashes we identify as having the same cause to:
      • Stack Smashes that call __stack_chk_fail() from the same function

      • Heap corruptions and double-free() detections that had malloc() or free() called from the same application function.

  • A crash icon for when a crash was detected

    • Should look rather harmless, perhaps a spiky ball
    • We may want to pulse this every few seconds if automatic reports aren't being sent
  • A vulnerability icon when a vulnerability is suspected

    • Should look somewhat vicious, perhaps in a red circle, with an '!' incorporated
    • We may want to pop a balloon up for this one that says, "Possible security vulnerability detected, click here for more information," when automatic reporting is off
    • Should probably allow users to have specifically suspected security vulnerabilities automatically reported even if they don't report everything
  • Add a communication indicator when a report is being sent (automatically or user-triggered)

    • Should overlay crash icon
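
The icon rules above can be sketched as a small selection function. This is a minimal sketch, not a proposed implementation: every name in it (apr_icon, apr_status, apr_choose_icon, apr_crash_is_new) is hypothetical, the 30-day window is just the "month" example from the list, and the precedence of vulnerability over crash over information is an assumption the spec does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical icon states, one per top-level bullet in the Design list. */
enum apr_icon {
    APR_ICON_NONE,          /* nothing is waiting */
    APR_ICON_INFO,          /* new crash sent, extra information needs review */
    APR_ICON_CRASH,         /* a crash was detected */
    APR_ICON_VULNERABILITY  /* a vulnerability is suspected */
};

/* Hypothetical snapshot of what the notifier currently knows. */
struct apr_status {
    bool crash_detected;
    bool vulnerability_suspected;
    bool sent_needs_review;
    bool sending;           /* a report is being sent right now */
};

/* Assumed window for "new": 30 days, standing in for "a month". */
#define APR_NEW_WINDOW_SECONDS (30.0 * 24 * 60 * 60)

/* A crash is "new" if the same cause was never seen (last_same_cause == 0)
 * or was last seen longer ago than the window. */
static bool apr_crash_is_new(time_t last_same_cause, time_t now)
{
    if (last_same_cause == (time_t)0)
        return true;
    return difftime(now, last_same_cause) > APR_NEW_WINDOW_SECONDS;
}

/* Pick the base icon. The precedence used here is an assumption. */
static enum apr_icon apr_choose_icon(const struct apr_status *s)
{
    if (s->vulnerability_suspected)
        return APR_ICON_VULNERABILITY;
    if (s->crash_detected)
        return APR_ICON_CRASH;
    if (s->sent_needs_review)
        return APR_ICON_INFO;
    return APR_ICON_NONE;
}

/* The communication indicator overlays the crash icon while sending. */
static bool apr_show_send_overlay(const struct apr_status *s)
{
    return s->sending && apr_choose_icon(s) == APR_ICON_CRASH;
}
```

Keeping the decision in one pure function would let the notifier recompute the icon whenever its status changes, without tying the rules to any particular toolkit.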

Implementation

glibc:

  • Die from __stack_chk_fail() in glibc in a way we can detect, using one of the two approaches below:

    • call fork(), kill(getpid(), SIGSTOP), system()

      • fork() creates a new child, which execve()s.

        • The child process is the AutomatedProblemReports crash handler, and gathers debug data.

        • The child process should be instructed to send SIGKILL to the parent when finished.

      • SIGSTOP can't be blocked, trapped, or ignored; and makes a whole thread group unschedulable.

        • Keep stopping in a loop in case some fool keeps trying to reanimate the process with SIGCONT.

    • Suicide with kill(getpid(),SIGKILL) in __stack_chk_fail()

  • Call the AutomatedProblemReports reporter when heap badness occurs.

    • Report the difference between double free(), corrupt heap chains when walked by malloc(), and free() landing in bad data (indicates either heap overflows or just a bad address passed to free()).

    • It should be safe to use the same fork() method as suggested with the stack smash handler.
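
The fork()-and-stop approach described above might look roughly like the following. This is only a sketch under stated assumptions: apr_stop_and_report and its handler_argv parameter are hypothetical names, the real handler's path and arguments are not defined by this spec, and the fallbacks on fork() or exec failure are a guess at sensible behaviour rather than anything decided here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: start the crash handler in a child process and keep
 * the failing process stopped until the handler kills it.
 * handler_argv is a NULL-terminated argument vector; handler_argv[0] is the
 * path to execute. The handler is expected to send SIGKILL to its parent
 * once it has gathered its debug data.
 */
static void apr_stop_and_report(char *const handler_argv[])
{
    pid_t child = fork();

    if (child == 0) {
        /* Child: become the crash handler. */
        execv(handler_argv[0], handler_argv);
        /* exec failed (assumed fallback): nothing can be gathered,
         * so kill the parent straight away. */
        kill(getppid(), SIGKILL);
        _exit(127);
    }

    if (child < 0) {
        /* fork() failed (assumed fallback): plain suicide, as in the
         * second option. */
        kill(getpid(), SIGKILL);
    }

    /* Parent: SIGSTOP cannot be blocked, caught, or ignored. Loop so that
     * a SIGCONT from elsewhere only leads to another stop. */
    for (;;)
        kill(getpid(), SIGSTOP);
}
```

The function never returns: the process either stays stopped or dies from SIGKILL, so the supervising side can recognise the outcome from the wait status.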

Code

Code will be needed in the areas described in Implementation.

We will need AutomatedProblemReports working first.

The AutomatedProblemReportsTagging specification must be approved and implemented.

Data preservation and migration

No issues exist.

Unresolved issues

Priority

Priorities are difficult to assign; we can find and fix certain bugs quickly, but those bugs are often the least of our problems. Stack smashes and double-free()s, for example, are blatantly obvious, but they are also being trapped and stopped; a simple SIGILL, on the other hand, could indicate that the instruction pointer got moved to the middle of an instruction, and thus that bad data can possibly alter program flow, creating an undetected security breach.

In the end this means that if we give priority to 'blatantly obvious' then we will spend more time fixing hard-to-execute vulnerabilities than working on what we believe may be easy-to-execute vulnerabilities; but if we give priority to studying potential vulnerabilities then 'known vulnerabilities' are suddenly not being given the utmost priority.

Below are classes that we can organize suspected vulnerabilities into.

  • Trapped: We detected these, we know what they are.

    • Stack Smash. A buffer overflow on the stack.

    • double-free(). free() of a pointer that was already free()d.

  • Detected: The application detected these situations; it does not know whether they are security holes, but they likely are.

    • Heap Corruption. Triggered by malloc() checks.

  • Untrapped: These are crashes we think may be security vulnerabilities.

    • SIGSEGV under various conditions.

      • Execute. SIGSEGV is delivered if non-executable or non-mapped memory is executed. This may indicate that the instruction pointer can be moved; an attacker may be able to gain control over program flow.

      • Write. SIGSEGV is delivered if non-writable or non-mapped memory is written to. This may indicate that an attacker can write arbitrary data to arbitrary addresses; most interestingly, an attacker may write just beyond the canary value in the stack and completely evade GccSsp.

      • With Format Strings. SIGSEGV in a function dealing with format strings may indicate an arbitrary read or write using a format string bug. We have no other form of detection of format string bugs; if we see printf() or such SIGSEGV we should be wary.

    • SIGILL under various conditions.

      • In PROT_WRITE|PROT_EXEC. SIGILL in writable, executable code may indicate altered code. If it is possible to get the program to write to arbitrary memory, code or data can be overwritten; this may indicate the possibility to evade GccSsp, see SIGSEGV when writing.

      • In non-writable text. SIGILL can be delivered when executing a code segment mapped from a program or library. This may indicate a compiler bug; or it may indicate that the instruction pointer was somehow moved and landed half-way through an instruction. The latter case is similar to SIGSEGV on execute.
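
The classes above could be expressed as a simple classifier. This is a sketch only: the types and field names (apr_class, apr_event, apr_crash) are hypothetical, and how the reporter would actually learn whether a fault was an execute or a write, or whether it happened in a format-string function, is left open by this spec.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* The three classes from the list, plus a catch-all for everything else. */
enum apr_class {
    APR_TRAPPED,
    APR_DETECTED,
    APR_UNTRAPPED,
    APR_UNCLASSIFIED
};

/* Hypothetical: how the problem came to our attention. */
enum apr_event {
    APR_EV_STACK_SMASH,      /* __stack_chk_fail() was called */
    APR_EV_DOUBLE_FREE,      /* free() of an already freed pointer */
    APR_EV_HEAP_CORRUPTION,  /* malloc() consistency checks failed */
    APR_EV_SIGNAL            /* the process died from a signal */
};

/* Hypothetical crash description handed to the classifier. */
struct apr_crash {
    enum apr_event event;
    int signo;               /* only meaningful for APR_EV_SIGNAL */
    bool fault_on_execute;   /* SIGSEGV while executing memory */
    bool fault_on_write;     /* SIGSEGV while writing memory */
    bool in_format_function; /* SIGSEGV inside printf() or similar */
};

static enum apr_class apr_classify(const struct apr_crash *c)
{
    switch (c->event) {
    case APR_EV_STACK_SMASH:
    case APR_EV_DOUBLE_FREE:
        return APR_TRAPPED;
    case APR_EV_HEAP_CORRUPTION:
        return APR_DETECTED;
    case APR_EV_SIGNAL:
        /* Both SIGILL conditions in the list are suspicious. */
        if (c->signo == SIGILL)
            return APR_UNTRAPPED;
        if (c->signo == SIGSEGV &&
            (c->fault_on_execute || c->fault_on_write || c->in_format_function))
            return APR_UNTRAPPED;
        return APR_UNCLASSIFIED;
    }
    return APR_UNCLASSIFIED;
}
```

Note that the class says how a problem was found, not how urgent it is; as discussed above, an Untrapped crash may well deserve more attention than a Trapped one.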

Grouping

It would be a good idea to group problems suspected to be similar. This avoids diluting the AutomatedProblemReports with the same problem over and over, and so makes more efficient use of Ubuntu Developer time when reviewing them. Grouping is defined by AutomatedProblemReportsTagging, which this spec depends on; but some obvious groupings we should consider are:

  • Strong: These are thought to be strongly associated

    • Stack Smash from the same function.
    • double-free() from the same calling function.

  • Weak: These may be associated but we are just guessing.

    • Failed glibc malloc() or free() checks from the same calling function.

    • Signals delivered with the same most recent functions in the backtrace.
      • In this case we could even sort by N identical recent functions.
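
Sorting by N identical recent functions only needs a count of how many of the most recent backtrace frames two crashes share. A minimal sketch, assuming a backtrace is available as an array of function names with the most recent call first (apr_common_recent_frames is a hypothetical name):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count how many of the most recent frames two backtraces have in common.
 * a[0] and b[0] are the most recent functions; na and nb are the lengths.
 * Comparison stops at the first frame that differs.
 */
static size_t apr_common_recent_frames(const char *const *a, size_t na,
                                       const char *const *b, size_t nb)
{
    size_t n = 0;

    while (n < na && n < nb && strcmp(a[n], b[n]) == 0)
        n++;
    return n;
}
```

Two crashes with a larger count are more likely to share a cause, so reports could be ordered by this value within a weak group.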

BoF agenda and discussion

Comments

I got the idea to relate signals to vulnerabilities from PaX, which spits stuff into dmesg when you execute data or write to code. The compiler gives you an executable stack if you use a trampoline or build an assembly file without an explicit .note.GNU-stack stating you want a non-executable stack; and heap execution is not fashionable. The chances of the program legitimately triggering these kinds of things are rather low. --JohnMoser

The problem with prioritizing these is there's no "low priority" security bug. If on the face it looks to be that you can write to any arbitrary address, then on the face it looks like you can write to the stack just above the canary GccSsp uses and create a Stack Smash mimicry without really doing a Stack Smash--and without being detected. The fact that we say then for example that a Stack Smash is much more important to address than a SIGSEGV happening because somehow we tried to write to program code is a farce; a Stack Smash is just much more reliably a vulnerability and much easier to debug and fix. --JohnMoser


CategorySpec