Sunday, April 19, 2020

A Proposal for a Flexible, Composable, Libre Desktop Environment

Note: I posted a better-formatted version of this document here as a PDF file.  Unfortunately there are some formatting glitches with Blogger's HTML editor.

Disclaimers:

  1. This is not an official project. This document describes my thoughts about a desktop environment intended for Unix-based operating systems that is libre (i.e., free software per the definition of the Free Software Foundation), is composable (where users can create command-line and GUI tools by connecting smaller tools together), and is flexible (where software tools do not impose a particular user interface, allowing the user to modify the UI of the tool to best suit the user's preferences). Whether or not I will work on it is something I still need to consider, but I'm sharing my thoughts for feedback to see if pursuing this as a side project is worthwhile.
  2. I will be expressing many opinions in this document. In the words of LeVar Burton, "You don't have to take my word for it."

Problems with Today's Desktop Environments and Applications

  1. Smartphone- and tablet-based UI/UX metaphors have been inappropriately applied to some desktop environments, resulting in a loss of usability compared to the desktop environments of the 2000s. This is especially apparent in Windows 8, GNOME 3, and (to a lesser extent) Windows 10. When Apple introduced the iPhone and iPad in 2007 and 2010, respectively, there was much talk in the personal computing world about mobile computing replacing desktop computing. The developers of Windows and GNOME were heavily influenced by this thinking, and they sought to develop versions of their desktops that aimed to be suitable for both desktop and mobile computing. Now, I must commend the developers of Windows and GNOME for taking risks. Windows 7 and GNOME 2 were well-received by many people, and it was a gamble changing these environments. The results were Windows 8 and GNOME 3. While these environments were well-received by mobile users, some desktop users were disappointed, feeling that the user experience was a downgrade from Windows 7 and GNOME 2. For Windows this led to people refusing to upgrade from Windows 7, and for GNOME this led to the fracture of the GNOME-based desktop community into GNOME 3, MATE, and Cinnamon, with both MATE and Cinnamon aiming to serve those alienated by GNOME 3’s changes. I believe the lesson in this is that developers of desktop environments should respect the fact that desktop computing has fundamentally different use cases than mobile computing, and that trying to create a common interface for both ends up misapplying UI/UX metaphors.
  2. The UI/UX design fads of the 2010s, including “flat design” and the gratuitous use of screen space, are a usability regression from the desktops of the 1990s and the 2000s. Consider the Windows 95 and Mac OS 7.5 interfaces. It is largely clear to see which elements are clickable and which ones are not. This held true as late as the late 2000s with Windows 7 and Mac OS X 10.6 Snow Leopard. Contrast that with the flat interfaces of a lot of software products today where it’s much harder to visually determine which elements are clickable and which ones are not. It’s not just flat design that’s problematic; there are other design decisions I disagree with. In macOS, what were once easily-visible scroll bars that were colored in bright blue have been replaced with thin, gray scroll bars that are harder to use, with the assumption that we’ll be using our mouse’s scrollwheel or our laptop’s touchpad’s scroll gestures instead of the actual scroll bar. In Windows 10, the title bars look excessively large relative to the menu bar (the reason for the large bars is to be able to move the window in a touchscreen interface; this is an example of a design decision that would be appropriate for mobile computing but is unnecessary in desktop computing), and its windows in new-style programs often consume large amounts of whitespace. I would love to be able to switch to Classic mode (i.e., a Windows 2000-style interface) in Windows 10; on Windows I feel most productive in Classic mode. There’s just one problem….
  3. Modern desktop environments and applications are increasingly curtailing the ability for users to control the appearance of their desktop environment and their applications. For Mac users this is not a new development. Ever since the transition from Mac OS 9 to Mac OS X in 2001, Apple has not provided mechanisms for users to apply themes that are different from the Mac OS X Aqua interface. Windows, however, used to support many modifications to its default themes. This changed in Windows 10 when it became more difficult to theme the desktop environment. In 2019 some GNOME developers wrote an open letter urging Linux distributions to not apply custom themes to their applications. Here is a key excerpt from the letter:
    “On a platform level, we believe GTK should stop forcing a single stylesheet on all apps by default [emphasis original]. Instead of apps having to opt out of this by hardcoding a stylesheet, they should use the platform stylesheet unless they opt in to something else. We realize this is a complicated issue, but assuming every app works with every stylesheet is a bad default.”
    Although the signatories of the open letter have explicitly stated that they are not opposed to end-users “tinkering” with the style of their applications, I feel that their suggestion to require GTK applications to explicitly opt into theming will, if implemented, make it more difficult for users to apply themes to their desktop environments and applications.
  4. Many desktop environments and applications lack the ability for users to customize the UI based on their needs and preferences. UI/UX decisions are a major cause of complaints about software. In some situations users respond by rejecting that software and seeking out alternatives. In other situations, though, the user doesn’t have a choice and must learn to cope with the UI.
    But what if users had another choice? What if users were able to modify the UI of their software as they saw fit without resorting to modifying its source code? For users who are not comfortable with adjusting their UI settings, what if they could download UI configurations from a repository of user-submitted configurations, including from UI experts who ran formal usability tests? This would increase user satisfaction with software products, since users wouldn’t feel that they have to accept the UI decisions that were made by the product’s developers and designers.
    Microsoft took a step in the right direction by allowing its ribbon in Microsoft Office to be user-modifiable. When the ribbon was introduced in Office 2007, it had few configuration options, and it was controversial among long-time Office users, particularly since there was no way to switch back to the menu-and-toolbar-based interface of Microsoft Office 2003. However, while later versions of Microsoft Office still do not provide a means to return to menus and toolbars, the ribbon has been made much more customizable.

Monolithic Applications versus Composable Tools

Contemporary desktop environments promote the use of monolithic applications where the application itself is expected to provide the functionality that users need in order to perform a task. Often these applications are “silos,” where they tend to not interact well with each other unless they are part of a common suite of applications such as Microsoft Office and the Adobe Creative Suite. While some of these applications may provide internal scripting support (such as Microsoft Visual Basic for Applications), most applications don’t provide external scripting support (e.g., the ability for a bash script or a Python program to be able to access Adobe Photoshop’s image cropping functionality in order to programmatically crop images).
I contrast this with the traditional Unix approach of combining small tools to perform large tasks. While there are many tenets of the Unix philosophy, there are three tenets that I will emphasize the most:
  • There is no distinction between user and programmer.
  • Programs should do only one thing, and do it well.
  • Users are encouraged to combine small tools into larger tools using mechanisms such as pipes, I/O redirection, and shell scripting instead of developing large, monolithic applications that perform multiple tasks.

This philosophy is expressed and taught in the 1984 book The Unix Programming Environment by Brian Kernighan and Rob Pike, Bell Labs researchers who have played a major role in the development of Unix.
The idea of a user environment as a suite of composable tools is not inherently limited to command-line environments. OpenDoc was a project spearheaded by Apple, IBM, and other companies in the mid-1990s that encouraged software vendors to develop and sell components that can be combined by users and other developers to create larger solutions, either in the form of a document or a larger application. The business goal was to challenge the dominance of large, monolithic applications by creating an ecosystem of smaller, composable utilities, allowing more software companies to compete in the software marketplace and also providing users and developers increased flexibility in their workflows. These components would run on the classic Mac OS, Apple’s eventually-cancelled Copland project, IBM OS/2, and other supported operating systems. Unfortunately, other than the influential Cyberdog web browser and a small handful of other OpenDoc components, OpenDoc did not last very long in the marketplace, and its impact was limited. OpenDoc’s development stopped in 1997 when Apple cut many engineering projects in order to focus on adapting the technology from the newly-acquired NeXT to its operating system strategy, which ultimately led to the release of Mac OS X 10.0 in March 2001.
Despite this setback, I believe OpenDoc was a victim of Apple’s circumstances, and I believe the ideas of OpenDoc should be re-explored for today’s desktop software.

Composable Tools Are Objects

One important key to building composable tools that work in programmatic, command-line, and GUI environments is using objects. OpenDoc was a C++ API backed by IBM’s System Object Model. However, there are rich dynamic object models that we can explore as alternatives, including Smalltalk’s derivatives such as Squeak and Pharo, Objective-C (which was heavily influenced by Smalltalk), and the Common Lisp Object System. By using dynamic objects as the foundation for software components, we can overcome the limitations of Unix’s pipeline approach to program composition, which relies on the weak link of parsing streams of text, and we can take advantage of the flexibility that dynamic objects provide as opposed to static objects. In fact, we can think of Unix utilities as objects with a run() method that accepts the utility’s command-line arguments and outputs a string. By explicitly expressing tools as objects, we allow for a much wider range of inputs and outputs that are not limited to text streams, resulting in a richer experience.
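To make the idea of "utilities as objects with a run() method" more concrete, here is a minimal sketch in Python. Python is used only for brevity; it is not one of the object systems under consideration. Every name in it (FileInfo, Tool, ListFiles, LargerThan, SortBy, pipeline) is hypothetical and invented for illustration, not taken from OpenDoc or any existing system. Each tool's run() accepts and returns structured records, so a downstream tool reads fields rather than parsing text.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Protocol


@dataclass(frozen=True)
class FileInfo:
    """Structured record passed between tools instead of a line of text."""
    name: str
    size: int  # size in bytes


class Tool(Protocol):
    """Hypothetical interface: a tool is any object with a run() method."""

    def run(self, items: Iterable[Any]) -> Iterable[Any]: ...


class ListFiles:
    """Source tool. Uses a fixed in-memory listing so the sketch is self-contained."""

    def __init__(self, listing: dict) -> None:
        self.listing = listing

    def run(self, items: Iterable[Any] = ()) -> Iterable[FileInfo]:
        return [FileInfo(name, size) for name, size in self.listing.items()]


class LargerThan:
    """Filter tool: keeps records whose size exceeds a threshold."""

    def __init__(self, min_size: int) -> None:
        self.min_size = min_size

    def run(self, items: Iterable[FileInfo]) -> Iterable[FileInfo]:
        return [f for f in items if f.size > self.min_size]


class SortBy:
    """Sort tool: orders records by a named field, with no column parsing."""

    def __init__(self, field_name: str) -> None:
        self.field_name = field_name

    def run(self, items: Iterable[Any]) -> Iterable[Any]:
        return sorted(items, key=lambda item: getattr(item, self.field_name))


def pipeline(*tools: Tool) -> list:
    """Compose tools left to right, like a shell pipe that carries objects."""
    data: Iterable[Any] = ()
    for tool in tools:
        data = tool.run(data)
    return list(data)


result = pipeline(
    ListFiles({"video.mkv": 9_000_000, "notes.txt": 120, "photo.png": 48_000}),
    LargerThan(1_000),
    SortBy("size"),
)
```

The same pipeline in a shell would need each stage to agree on a text layout (which column holds the size, how names with spaces are quoted); here the record type carries that agreement.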

Separating UI from Core Functionality

A very important design tenet when developing composable tools is to separate the tool’s user interface from the tool’s core functionality. By not tightly coupling the UI with the underlying functionality, it is easier for the tool to be used under a variety of circumstances, whether those circumstances are (1) being invoked as an API call, (2) being run as a command line utility, (3) being run as a desktop GUI application, or even other circumstances.
As an example, suppose I want to provide a component that supplies a calendar. The object implementing the core functionality would supply methods such as retrieving the days of the week, figuring out whether the current year is a leap year, computing the number of days between two dates, storing events into a provided database, exporting to iCalendar format, etc. The Unix cal utility and GNOME Evolution could then be rewritten to use the calendar component. This is how the same core calendar functionality can be used in a variety of settings without being restricted to a particular application.
This is also a guideline for converting existing software. Suppose GIMP’s core functionality were separated from its interface. This would allow programs and command-line utilities to leverage GIMP’s image manipulation features without having to open the GIMP GUI application. It would also make it easier to develop alternative UIs for GIMP, especially when combined with the ability for users to customize the UI themselves.
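The calendar example can be sketched as follows, again in Python purely for illustration. CalendarCore and render_text are hypothetical names, and the sketch assumes Python's standard calendar and datetime modules as the backing implementation. It covers only some of the methods listed above; event storage and iCalendar export are left out. The core class contains no UI code, and the cal-like text front end is just one of several possible consumers.

```python
import calendar
from datetime import date


class CalendarCore:
    """Core calendar logic with no UI code (hypothetical component)."""

    def weekday_names(self) -> list:
        # Names come from the standard library and follow the active locale.
        return list(calendar.day_name)

    def is_leap_year(self, year: int) -> bool:
        return calendar.isleap(year)

    def days_between(self, start: date, end: date) -> int:
        return (end - start).days

    def month_grid(self, year: int, month: int) -> list:
        # Weeks as lists of day numbers; 0 marks days outside the month.
        return calendar.monthcalendar(year, month)


def render_text(core: CalendarCore, year: int, month: int) -> str:
    """A cal-like command-line front end built only on the core's methods."""
    header = " ".join(name[:2] for name in core.weekday_names())
    title = f"{calendar.month_name[month]} {year}".center(len(header))
    rows = [
        " ".join(f"{day:2d}" if day else "  " for day in week)
        for week in core.month_grid(year, month)
    ]
    return "\n".join([title, header, *rows])


if __name__ == "__main__":
    print(render_text(CalendarCore(), 2020, 4))
```

A GUI front end would call the same CalendarCore methods and draw the grid with widgets instead of text, and a script could call days_between() directly without any front end at all.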

UI Customizability and Themability

How will users be able to customize the GUI? I envision this to be a combination of two technologies:
  1. OpenDoc’s ability to merge components in a free-form style as part of a visual container structure known as a Document.
  2. Microsoft Office’s support for user-customizable menus, toolbars, and ribbons. With the exception of Microsoft Office 2007, since at least Microsoft Office 97 there has been extensive support for users to be able to change menus, toolbars, and/or the ribbon as they see fit.
How is this exposed programmatically? Each GUI component exports methods that implement some type of command. For example, a text editor would have commands corresponding to “Save File,” “Find/Replace,” “Delete Specified Lines,” etc. These commands would correspond to either menu items or toolbar buttons. Users can then modify how menus look, whether to use icons or words to describe commands in toolbars, whether to have a horizontal toolbar or a vertical one, etc.
All UI elements would be implemented under a common framework, and the framework chosen or developed will allow theming.
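One possible shape for the command mechanism described above is sketched here in Python. All names (Command, TextEditor, user_layout, build_toolbar) are hypothetical, and the editor's actions are stubs that only report what they would do. The point is that the component exports commands, while the arrangement of menus and toolbars is plain data that the user owns and could share or download.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Command:
    """A named action exported by a component (hypothetical structure)."""
    command_id: str
    label: str
    icon: str
    action: Callable[[], str]


class TextEditor:
    """Toy component; its methods only report what they would do."""

    def save_file(self) -> str:
        return "saved"

    def find_replace(self) -> str:
        return "find/replace opened"

    def commands(self) -> dict:
        exported = [
            Command("save", "Save File", "disk", self.save_file),
            Command("find", "Find/Replace", "magnifier", self.find_replace),
        ]
        return {c.command_id: c for c in exported}


# User-editable layout: plain data that could live in a config file or be
# downloaded from a shared repository of UI configurations.
user_layout = {
    "toolbar": {"orientation": "vertical", "style": "icons", "items": ["find", "save"]},
    "menus": {"File": ["save"], "Edit": ["find"]},
}


def build_toolbar(commands: dict, layout: dict) -> list:
    """Resolve the user's toolbar layout into what each button displays."""
    bar = layout["toolbar"]
    attribute = "icon" if bar["style"] == "icons" else "label"
    return [getattr(commands[command_id], attribute) for command_id in bar["items"]]


editor = TextEditor()
commands = editor.commands()
toolbar = build_toolbar(commands, user_layout)
```

Switching the toolbar from icons to words, or reordering its buttons, changes only user_layout; the TextEditor component itself is untouched.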

Not a New Operating System

For some time I’ve thought of the idea of creating either a Smalltalk-based operating system (e.g., imagine Pharo running on bare metal instead of as a siloed VM) or a Lisp operating system influenced by Symbolics Genera. The underpinnings of these systems would make an excellent base for implementing the ideas discussed in this document.
However, there are two main challenges with this approach of creating a new operating system:
  1. A new operating system will lack device drivers. One of the things that hinder the development of non-Linux libre operating systems such as Plan 9, Haiku, and ReactOS is their relatively limited driver support. It will be a major effort writing device drivers for a new operating system.
  2. There is also the “chicken-and-egg” problem of switching to a new operating system. Without certain key components such as a web browser (which will require porting Firefox or Chromium, a large effort, or creating a new web browser, which is an even larger effort), it would be hard to convince people to switch to the new operating system. But if the operating system has few users, then developers would be less likely to develop for it.
As much as I’d love to use a Smalltalk or Lisp operating system, I believe the best approach is not to create a new operating system but to leverage the libre GNU/Linux/BSD ecosystem and build on top of it. This solves both the device driver problem and the “chicken-and-egg” problem. Users can still run existing applications and use existing tools side-by-side with new components. These new components can even leverage the same GUI toolkits as existing applications in order to ensure overall system consistency.

Decisions to Make: Object Systems and GUI Frameworks

Two core decisions I’m considering are the object system and the GUI framework. For the object system I am enamored of the powerful Common Lisp Object System, which would also bring Common Lisp’s impressive live debugging features, but Objective-C is appealing due to its ability to call C and C++ code without the use of any wrappers. For the GUI framework I am partial to GNUstep due to its Objective-C foundation, which supports dynamic dispatch and would thus make it easier to implement component-based systems. Using GNUstep also has the side bonus of bringing macOS users into the fold by providing native support for macOS.

Conclusion

Today’s desktop environments and applications suffer from a lack of flexibility, a lack of customizability, and a lack of composability among applications. This document proposes a new type of desktop environment and approach to application development that emphasizes components: objects that can be composed in ways that are even more powerful than Unix’s composable command-line tools. This document also advocates the development of GUI components that can be themed and that have malleable user interfaces. This new desktop environment would be built on existing Unix-like operating systems such as Linux and BSD.

2 comments:

  1. Have you looked at Arcan? (arcan-fe.com). I think it would be an excellent base for what you are proposing. The Pipeworld project that letoram built on top of it kind of shares some of these ideas of composability. There are videos on that blog

    There's also a divergent desktops article, where he mentions compositing as well: https://www.divergent-desktop.org/blog/2020/08/10/principles-overview/#p11
