CleanupAudioJumble


Summary

The idea is to make PulseAudio the default sound system on Ubuntu, replacing the Esound Sound Daemon (esd) and ALSA dmix. PulseAudio is a drop-in replacement for Esound that adds new features and opens up many entirely new areas of use.

Rationale

Apple managed to standardize on a single powerful sound system (CoreAudio) for Mac OS X that makes almost all users happy, from day-to-day desktop users to gamers to professional audio people. We should be able to provide the same on Linux. PulseAudio can currently provide this functionality at least partially, the only notable exception being pro audio. PulseAudio is a modular sound server, a kind of "application server" for audio. Beyond the obvious sound mixing functionality, it offers advanced audio features such as "desktop bling", hot-plug support, transparent network audio, hot moving of playback streams between audio devices, separate volume adjustments for every playback or record stream, very low latency, very precise latency estimation (even over the network), modern zero-copy memory management, a wide range of extension modules, availability for many operating systems, and compatibility with 90% of all currently available audio applications for Linux in one way or another.

In the future, PulseAudio is expected to extend to professional audio, entering JACK's current application area. This, however, is not relevant to the implementation of this spec, at least at this time.

Use cases

  • L. wants to play a video and a background music track at the same time, without any special setup or hassle but with lip-sync audio.
  • L. wants to transparently play back local audio on a remote machine.
  • L. wants to move the currently playing stream from the internal sound card of his laptop to the USB headset he just plugged in, without any interruption in playback and with only a few clicks in the UI.
  • L. wants the operation described in the previous item to be done automatically by the sound system if he plugs in his USB headset.
  • L. wants to control the volume for each playback stream separately, selecting the right mixer track based on the song name.
  • L. wants to merge his two stereo sound cards into a single 4 channel surround sound card.
  • L. wants his MP3 music to always be played at half the volume but Ekiga's voice stream at the full volume level.
  • L. wants to browse for the audio devices on the network and use them much the same way he already uses the shared network printers.
  • L. wants to move the local audio stream which is played by his bedroom's computer without interruption to the computer in the kitchen.
  • L. wants to multicast audio from his laptop to all machines in his network.
  • L. wants mixed audio but still low enough latencies for VoIP.
  • L. wants proper audio on his LTSP thin clients.
  • L. is using an average-quality speaker set. It needs some equalization to sound right; however, his audio player of choice (Rhythmbox :P) does not yet feature an EQ. He simply modifies the overall equalizer, and everything that his PC plays now sounds good on his setup.

Scope

This specification changes the default sound daemon for Ubuntu. The same is immediately applicable to Xubuntu, if desired.

Design

Mode of operation

In order to provide the highest possible audio quality, use all features such as HAL support and dynamic stream handling, and avoid opening any potential attack vector, the upstream-recommended mode of operation is to have a permanent pulse daemon running as the user, without automatic module unloading. The current version automatically releases the sound card when it is not in use and has HAL/ConsoleKit integration, so multiuser support works.

As long as nobody is using the sound card, OSS legacy apps continue to work.

Compatibility

PulseAudio emulates the OSS, ALSA, and esound APIs (amongst others), so that existing applications can be moved to Pulse without much effort. For the record, this emulation has nothing to do with ALSA's OSS emulation; Pulse will work even if these modules are loaded (but not used).

For fully transparent OSS emulation, we should check whether the FUSD userspace devices implementation is mature enough to replace the current LD_PRELOAD hacks.
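
For context, the LD_PRELOAD approach referred to above is what PulseAudio's padsp wrapper uses: it preloads a shim library so that a legacy program's OSS device access is redirected to the sound server. The following is a minimal Python sketch of launching a legacy OSS program that way. The helper names wrap_oss_command and run_oss_app are hypothetical, and the sketch assumes padsp is installed and on the PATH.

```python
import shutil
import subprocess


def wrap_oss_command(argv, wrapper="padsp"):
    """Return argv prefixed with the OSS wrapper (hypothetical helper)."""
    if not argv:
        raise ValueError("argv must name a program to run")
    return [wrapper, *argv]


def run_oss_app(argv):
    """Run a legacy OSS program through padsp when it is installed.

    Assumption: if padsp is missing, the program is started directly and
    will open the kernel OSS device itself, bypassing the sound server.
    """
    if shutil.which("padsp") is None:
        return subprocess.call(list(argv))
    return subprocess.call(wrap_oss_command(argv))
```

A userspace device implementation such as the one suggested above would make this wrapping step unnecessary, since the legacy program would open what looks like an ordinary device node.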

GUI

PulseAudio offers three different graphical user interfaces for controlling audio:

  1. pulseaudio volume control: controls the volume of sinks, sources, and streams, and allows the user to move streams between sinks.
  2. pulseaudio preferences: configures network-related services, including multicast.
  3. panel applet: notifies about changes of sound hardware, chooses the default device, and calls the other pulseaudio tools.

The volume control and preferences applications are sufficient for all the use cases mentioned above. The panel applet is just "nice to have" for control freaks, so we should ship it but not activate it by default.

We will ship the pulseaudio volume control by default, since it is a very convenient interface to control the volume per stream. However, we will keep the default Gnome mixer applet (which controls the hardware mixer levels) for now.

Implementation

Code

  • The esound package is no longer installed by default; it is replaced by pulseaudio-esound-compat.

  • The esound client library will stay around so as not to break Gnome sound events and custom packages that still rely on it.

  • Change gstreamer to prefer the pulse sink, and fall back to ALSA.
  • Change other applications to default to pulse output, if there is an available output module (xine, mplayer, libao (for Pidgin), xmms, Ekiga).
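
The "prefer pulse, fall back to ALSA" rule in the list above amounts to a preference-ordered lookup. The sketch below illustrates that logic only; it is not GStreamer's or any application's actual API, and the names PREFERRED_OUTPUTS and pick_output as well as the labels "pulse" and "alsa" are assumptions made for illustration.

```python
# Preference order taken from the list above: PulseAudio first, ALSA second.
PREFERRED_OUTPUTS = ("pulse", "alsa")


def pick_output(available, preferred=PREFERRED_OUTPUTS):
    """Return the first preferred output that is available, else None."""
    available = set(available)
    for name in preferred:
        if name in available:
            return name
    return None


# An application built without a pulse output module keeps using ALSA.
chosen = pick_output(["alsa", "oss"])
```

Keeping the order in one tuple means an application without a pulse output module simply skips to the next entry instead of failing.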

Data preservation and migration

Upgrades will be handled by ubuntu-desktop introducing a dependency on pulseaudio-esound-compat, which declares Conflicts/Replaces/Provides: esound.
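
To make the effect of the C/R/P (Conflicts/Replaces/Provides) relationship above concrete, here is a deliberately simplified toy model in Python. It is not how dpkg or apt resolve packages: it only models Conflicts (the old package is removed) and Provides (dependencies on the old name stay satisfied), and leaves out Replaces, which governs file takeover. The Package class and the install and satisfies helpers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Package:
    name: str
    conflicts: frozenset = field(default_factory=frozenset)
    provides: frozenset = field(default_factory=frozenset)


def install(installed, new):
    """Return the installed set after adding `new` (toy model).

    Packages named in new.conflicts are removed; nothing else changes.
    """
    kept = {name: pkg for name, pkg in installed.items()
            if name not in new.conflicts}
    kept[new.name] = new
    return kept


def satisfies(installed, dependency):
    """True if some installed package is, or provides, `dependency`."""
    return any(dependency == pkg.name or dependency in pkg.provides
               for pkg in installed.values())


# Upgrade scenario from the text: esound is present, the compat package arrives.
compat = Package("pulseaudio-esound-compat",
                 conflicts=frozenset({"esound"}),
                 provides=frozenset({"esound"}))
system = install({"esound": Package("esound")}, compat)
```

In this model the esound package disappears from the system, while anything that depends on the name esound still finds it provided.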

Comments

  • In order to get OSS compatibility, I would consider making ALSA-OSS modules dmix aware the cleanest hack!!! Somebody should get upstream ALSA involved.
  • On the PulseAudio website there is an interesting page about the "perfect" PulseAudio setup. ProgFou

  • Please note that the GUIs pulseaudio provides are high on geek crack, mentioning underlying libraries and tech jargon like alsa, hw:0, sinks, sources (monitor/virtual/hardware sources--what are these?), etcetera. This is not something normal users can easily understand. Rather than "alsa hw:0", brand and model of the soundcard should be displayed since they are recognizable by users. Don't include the current difficult GUIs by default please. PeterVanDenBosch

    • MartinPitt: That's also why we keep the good old Gnome mixer by default in the task bar.

  • On my system, I solved the problem with PulseAudio blocking the alsa device to other applications by connecting the audio sink and source to dmix:0 and dsnoop:0 instead of hw:0. Obviously this adds yet another layer of latency, but at least for me it isn't noticeable. On the other hand it makes legacy applications work perfectly, including 32-bit applications on amd64 which currently can't use PulseAudio via alsa because lib32asound2-plugins is missing. The dmix solution also avoids the inevitable race conditions which would occur with PulseAudio releasing the device only after a timeout. Even if the timeout is set to just one second, you will run into trouble if clicking the menu item to start a legacy alsa application also triggers a sound event. -- DanielElstner

    • In the "Perfect Setup" (see above) it is recommended to setup default ALSA devices (pcm.!default and ctl.!default) to go through PulseAudio, and of course make PulseAudio use direct references to ALSA hardware devices, to not loop between ALSA and PulseAudio. -- ProgFou

      • Yes. But connecting the sink explicitly to dmix:0 and at the same time having default routed through PulseAudio does not introduce a loop. It works perfectly here. Of course having pcm.default go through PulseAudio means that legacy applications which don't work through the alsa->PulseAudio binding cannot use the default audio device. The device plug:dmix:0 can be specified directly in order to bypass PulseAudio. But it would also be possible to set up an alias name (maybe "direct"?) for that purpose.

    • This proposal does not (and cannot) resolve instances where a knowledgeable user specifies an explicit virtual device that is not dmixed or dsnooped, e.g., hw:X{,Y} (or any extended ones like plughw:X{,Y}, plug:surroundfoo, etc.). This limitation applies to the core ALSA implementation and cannot be bypassed consistently. Arguably this concern lies outside the common desktop use cases; need more input from Ubuntu Studio considerations. - DanielTChen

      • Right. The knowledgeable user would just have to use plug:dmix:X instead. -- DanielElstner

    • Lennart Poettering's response to the same suggestion http://mail.gnome.org/archives/desktop-devel-list/2007-October/msg00150.html. Note that while latency is his main argument it is not his only argument. I think that having PA on top of dmix is not a reasonable default.

  • PulseAudio might replace esd in GNOME soon. Lennart Poettering discusses it on the gnome desktop devel list. It is already shipped as default in Fedora 8. Alexandre Franke

  • I think you forgot one important use case: L. wants to play a video game without noticeable latency in effect sounds. - Because this is a bit more than lipsync video or voip.
  • Fedora folks (Lennart Poettering) have plans for completely replacing gnome-volume-control and the current mixer applet for Fedora 9. Maybe it would be a good idea to follow their lead on that... -- Eh

  • Please take care not to break the possibility of using jack for apps that use it, like Ardour. ttoine
    • Luke, Cory, Emmet, and I have discussed pasuspender usage for jackd in Ubuntu Studio 8.04. One could use a wrapper around jackd, but that's racy. It would be better to re-promote jack-audio-connection-kit into main, build the jack module for PulseAudio, and avoid the hackery. Discerning users concerned with latency will use pasuspender or avoid PulseAudio altogether. - DanielTChen

  • Will Pulse Audio be able to handle Freebob/ffado (firewire) sound cards?
    • I guess not. This is definitely a huge problem for people who own firewire sound cards and nothing else (or, like me, a very poor quality internal sound card). It's a shame no solution using Jack was studied, because I'm sure Jack would have handled everything needed by this spec. Moreover, I think we cannot reduce firewire sound card usage to "pro audio". I'm thinking of, for example, someone who just owns such a sound card to be able to record himself singing with a guitar from time to time. -- NicolasJoyard

      • Umm, Jack *was* examined. However, it is not a general-purpose desktop audio solution. Anyway, if jack can find and drive these firewire sound "cards", then so can pulse — as soon as somebody writes the support for it (or, probably somewhat easier, ports it from jack). -- smurf
        • OK, I guess I'll have to look into porting ffado jack backend to pulseaudio :) Do you have any links to why jack "is not a general-purpose desktop audio solution"? Just for my personal culture... -- NicolasJoyard

        • Jack is intended more for audio processing and recording: it is real-time, it needs applications to be patched to each other and to the I/O of the sound card, and it is not very easy to create a "standard" setup for it, so I guess it was points like these which oriented development towards Pulse Audio. Please note that jack support for Pulse Audio can be added just by compiling and adding the pulse-audio-to-jack or jack-to-pulse-audio plugins; please see the source of pulse audio for that (I can do it myself, but I don't know how to package). The great thing about adding this support would be that all non-jack applications could be used in jack. And so it would be possible to use a firewire sound card supported by ffado/freebob with Pulse Audio through jack. I think that anyone who buys a firewire sound card will want to use jack at least to record himself, as somebody said. I would be glad to test and provide hardware if necessary. ttoine
        • There were a few salient points made, two or three developer meetings ago, which I don't remember well enough at this time. Somebody who does remember should probably add them to this spec's rationale. -- smurf
  • There is an issue with MIDI playback. Currently timidity can be used as an ALSA sequencer client, so applications can play MIDI files, but if PulseAudio is enabled timidity can't access the sound card. (timidity runs as a daemon, as root, before users log in.)

    • MartinPitt: that's something that needs to be fixed in timidity then: it should not keep a permanent connection to the sound card (Pulse doesn't), and/or it should use Pulse's ALSA/OSS emulation layer and work through pulse.

    • Timidity doesn't bind sound cards at boot. It registers ALSA MIDI ports at boot, but only binds to audio output ports during audio generation.
      • The problem is the default sink in the alsa config. If I set pulse, pulse is not available at boot, and if I set dmix then, even if pulseaudio releases the connection (which it doesn't in gutsy), there is no sound on the system when timidity starts playback. Is there any way to start timidity and configure it to use pulse even if it's not available when it is starting? --SorooshRadpoor

    • I think pulseaudio needs to be run as an ordinary user to be able to use a user's pulseaudio connection. Maybe we shouldn't run timidity on boot but on desktop login (session). I'm using esound output (timidity -iA -Oe) to route timidity through pulseaudio so I can play MIDI and audio together. -- WvEngen

  • What about Open Sound System (OSS) 4.0? It is finally GPLed. See http://www.opensound.com/ -- AzraelNightwalker 2007-12-01 15:15:30

    • MartinPitt: Pulse provides an API, mixer, and desktop management tools, whereas OSS provides kernel-level sound drivers; thus OSS is an alternative for ALSA, not for Pulse.

      • AzraelNightwalker: Ok, so it means that implementing OSS4 in Ubuntu would require another spec.

        • Currently eliminating buffoonery in the alsa-driver source package so that users can drop in OSSv4 - DanielTChen

    • jonaseberle: Saving and restoring different sound profiles would be of great use to lots of users.
  • OliverGrawert: Please be aware of the breakage Flash exposes with networked pulse connections (http://www.pulseaudio.org/ticket/43). There is a library that works around the issue (http://pulseaudio.revolutionlinux.com/PulseAudio explains a bit more). Sadly, that lib is linked against libssl as well as against LGPL libs, which makes it undistributable for us. Daniel Chen worked on a TLS port of that lib; the code can be found under https://code.edge.launchpad.net/libflashsupport-pulse .

    • This package (libflashsupport) is available in hardy/universe as of Alpha 3. - DanielTChen

  • There are a few new suggestions in launchpad that I believe should be integrated into this project IF they are not already. It would be nice to have a simple way to manage multiple sound cards. For example, if I were to plug a pcmcia or usb sound card into my laptop, it would automatically become the default card. At the present time, even if you select the right drivers, not all software will use the same sound card. Using two sound cards is not uncommon, since motherboards always have one, and if you add a better one (for example, a Creative sound card) you then have a second. Here are some other launchpad links discussing similar problems:
  • Need to be careful in Implementation ("Change other applications to default to pulse output, if there is an available output module (xine, mplayer, libao (for Pidgin), xmms, Ekiga)"). Setting the source package for, say, libao to pulse is straightforward, but it will likely break Kubuntu (which, AFAIK, will continue to use ALSA via arts). - DanielTChen

    • AFAIK KDE4 removes arts and introduces a new audio mechanism which should allow pulse, so Kubuntu should be fine. -- Zameer Manji
  • Oh... the PulseAudio GUI looks like it just adds another layer of complexity to the already-unintelligible UI: https://bugs.launchpad.net/ubuntu/+source/gnome-alsamixer/+bug/187848. Please don't do anything more to sound without addressing usability and understandability! Thanks - DaveAbrahams

  • You guys at Ubuntu and the Fedora peeps have together created a situation excruciatingly bad and a simple search of support forums of many distros will show you just HOW BAD THE PULSEAUDIO SITUATION IS and by extension sound in Ubuntu, Fedora, et al is! This is not a lightweight matter, and I hope to god there is some serious accountability within both Ubuntu and Fedora for this horrific decision. (At least Fedora has an excuse seeing as they knowingly trial run stuff such as SELinux as a matter of policy.)

    It really doesn't matter how good PulseAudio "could" be, what we have right now is utterly broken. It's as if someone said, "there's a mad cow problem in Britain so let's nuke it to clean it up!" Just reading stupid things above saying things to the effect of, "we had some good reasons but we can't remember them", and "well if Jack does firewire soundcards then PulseAudio could write that support in too" makes me shake my head in sadness at the poorly disciplined software practices occurring.

    Here's a tip: next time you're planning a release, make sure to tell everyone you're including ALPHA sound support because that's what you've got. "It works for me" doesn't cut it when the support forums are full of despair. And don't you dare blame closed source systems like Flash. You are responsible to ensure that Ubuntu works with the Linux version of a product on your distro, not them. Disgusting. I was despairing at the general situation before I read this page, now I'm just angry. - Marcus238

  • For anyone reaching this page after struggling with the mess that is Ubuntu sound and PulseAudio, have a look at the competition. Not only do they tend to "just work" and have some really great GUI control apps that are much clearer in how they work for complex audio inputs and sinks, but they have more features!!!

    Instead of reading the long PulseAudio "Perfect Setup" page which is basically a long list of how to change how everything else works to fit PulseAudio (they've got some crazy nerve those douches at PulseAudio, don't they?), you could just install something that's been around for awhile and actually works and makes sense. I struggle to understand exactly how your target Ubuntu user is going to ever make sense of the complex applications and language in the supposedly user-friendly PulseAudio apps.

    PulseAudio dudes: Just calling it "user friendly" doesn't make it so. I don't know how you guys got away with your scam but other projects actually have to be user friendly to claim that. (I've used Ubuntu for three years and created an account to comment for the first time ever because the situation is just that bad.) - Marcus238

  • Ohhhhhh, I just got it! I'll bet you guys did this because the most commonly used GUI userspace apps for JACK are written with Qt (that's Qt, not KDE). I'm gonna guess you guys don't like Qt in regular Ubuntu? (Never mind that Gtk is a standard part of any Qt dominated install, sigh at the stupid turf wars.) Is that it? Did it seem easier to try to make an alpha quality mess of system software work rather than port Qt GUI to Gtk? This despite the confusing and Unix/Linux jargon-heavy PulseAudio GUI apps? Maybe a preference for Unix/Linux jargon and complexity rather than problem domain audio jargon? Sorry about third comment, but it just dawned on me that this is probably the result of some bullshit fatal flaw raised over Qt vs Gtk. Perhaps? (Not that you'd admit it, I suppose.)

    I should mention that until a user can watch YouTube for 365 days in a row on an Ubuntu install undergoing normal cycles of application installs, removals, cycles, plus use of some new devices attached, without ruining the "YouTube" experience, you're not going to find your "Desktop" distro anywhere near my teenage daughters' computers. In case you didn't realize it, the desktop test is, "does it 'run' YouTube" without a single brush with jargon (unless you consider "Volume" to be jargon). Audio is the fatal flaw of Ubuntu and Linux in 2008, and you just pushed us back a couple years with the PulseAudio decision in Hardy.

    And yes, this is my last comment. In case you're wondering if this warrants such vitriol, do some Google searches: "pulseaudio problem ubuntu" returns 537,000. Congratulations. - Marcus238


CategorySpec

DesktopTeam/Specs/CleanupAudioJumble (last edited 2008-08-06 16:22:41 by localhost)