Firefox Mobile Concept Video

Firefox is coming to mobile. The innovation, usability, and extensibility that have propelled Firefox to 200 million users are set to do the same for Firefox in a mobile setting.

User experience is the most important aspect of having a compelling mobile product. Every bit of interaction and pixel of presentation counts when typing is laborious and screen sizes are minuscule. Many of the standard interaction models, like menus, always-present chrome, and having a cursor, don’t necessarily make sense on mobile. It’s a wickedly exciting opportunity but there are myriad challenges to getting it right.

One avenue for exploring this opportunity is through Mozilla Labs, which is about pushing the envelope towards better and brighter interaction horizons, as well as incorporating a wider community into the innovation process. This concept video explains one direction we are thinking in, and we’d love to inspire participation in thinking about other directions.

Standard Mockup/Experimental UI Disclaimer
All of the images and videos are only conceptual mockups of an experimental UI for Firefox Mobile. Any particular feature may end up looking entirely different, or may not even make it into the final release.

Screencast

Design Principles

Touch. This concept prototype for Firefox mobile (code name Fennec) is being designed for a touch screen. Why not multitouch? Because Firefox should be able to run on the least common denominator of touch devices. Especially for touch-enabled interfaces, direct manipulation is key. Along that line of thought, the interface should be operable with a finger. Switching between input methods is time-consuming and annoying, so the user shouldn’t have to switch to a stylus or other secondary form of input. Firefox will work on non-touchscreen devices, but that’s out of scope for this demo.

Large targets are good. The same fingertip that controls the interface takes up between 1/10th and 1/5th of the height or width of the mobile touch screen. In other words, fingers are fat: hitting small targets is like trying to touch-type with your elbow. All actions should be represented by targets that are large enough to be fast, easy, and (at the very least) not aggravating to hit.

Visual Momentum and Physics are compelling. Nothing shouts “sexy!” like pretty animations and a physics engine. Beyond marketing appeal, there is a strong argument that such physicality helps the user build a mental model of the interface, and that interface physics yields consistency. We are wired to track the movement of things and to be able to remember where they’ve gone, as long as they don’t appear and disappear, which doesn’t happen in the real world. Of course, copying every physical metaphor blindly gets you interfaces like the multi-million dollar blunder that was Microsoft Bob, so we need to select our metaphors carefully.

Typing is difficult. This means we want to minimize the number of keystrokes required to get anywhere or do anything.

Content is king. With restricted screen size, every pixel counts. As much of the screen as possible should always be dedicated to content, not controls or cruft.

Opening Screen

Let’s start with the basics. When you first open the browser, you’ll see two things: your bookmarks (right) and a big plus button (left), which opens a new tab/page.

Clicking on bookmarks zooms the browser to the page. You can scroll the page by dragging as well as flicking. Scrolling has momentum (like the iPhone), so it’s easy to go long distances.
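To make the momentum idea concrete, here is a minimal sketch of kinetic scrolling: after the finger lifts, the page keeps moving and loses a little speed every frame until it stops. The function name, the friction factor, and the stopping threshold are all assumptions picked for illustration; they are not taken from the prototype.

```python
def flick_positions(start, velocity, friction=0.95, min_speed=0.5):
    """Return the page offset for each animation frame after a flick.

    velocity is in pixels per frame. friction and min_speed are
    illustrative tuning values, not numbers from the prototype.
    """
    positions = []
    position = start
    while abs(velocity) >= min_speed:
        position += velocity
        positions.append(position)
        velocity *= friction  # the page loses a little speed each frame
    return positions


frames = flick_positions(start=0.0, velocity=40.0)
print(f"{len(frames)} frames, travelled {frames[-1]:.0f} px")
```

A harder flick starts with a larger velocity, so it both lasts longer and travels farther, which is what makes long distances cheap.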

You zoom out again by dragging the page past its border in any direction. One of the cool things about this is that the gesture for zooming out to the home screen is simply throwing the page in any direction. It’s singularly therapeutic! It’s also discoverable: in the panning process you discover the visual clues that there is more past the edge of the page. In informal user tests every user figured out the interaction without instructions or help.

New Tabs

Creating a new tab is easy. You click on the big plus, and the browser finds an open spot, puts the tab there, and zooms in on it. If you have a homepage, the new tab opens there. Even as the page zooms further out, the new tab button remains in the same logical spot.
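How the demo picks the open spot isn't spelled out here, so the following is only one plausible approach: treat the zoomed-out view as a grid and take the first free cell, scanning row by row. The grid model, the column count, and the function name are all hypothetical.

```python
def find_open_spot(occupied, columns=3):
    """Return the first free (row, col) cell, scanning row by row.

    occupied is a set of (row, col) cells that already hold a tab.
    The grid layout and column count are assumptions for this sketch.
    """
    row = 0
    while True:
        for col in range(columns):
            if (row, col) not in occupied:
                return (row, col)
        row += 1  # every cell in this row is taken, try the next one


tabs = {(0, 0), (0, 2)}
print(find_open_spot(tabs))
```

Because the scan always fills gaps first, closing a tab frees a spot that the next new tab will reuse.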

Placing the cursor in the URL bar immediately gives a set of results (a la the awesome bar) that is generated from your history, sorted by both frequency and recency. To do an immediate search on what you’ve typed, you can just tap a search provider at the bottom of the results.
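Sorting by both frequency and recency can be sketched with a single score per history entry. The formula below (visit count, halved for every 30 days since the last visit) is an assumption made up for this example and is not the awesome bar's actual scoring; the function names and the history format are hypothetical too.

```python
def frecency(visit_count, days_since_last_visit, half_life_days=30.0):
    """Score a history entry: more visits raise it, age lowers it."""
    recency_weight = 0.5 ** (days_since_last_visit / half_life_days)
    return visit_count * recency_weight


def suggest(history, typed, limit=5):
    """Return up to `limit` matching URLs, best score first.

    history maps url -> (visit_count, days_since_last_visit).
    An empty `typed` string matches every entry.
    """
    matches = [url for url in history if typed.lower() in url.lower()]
    matches.sort(key=lambda url: frecency(*history[url]), reverse=True)
    return matches[:limit]


history = {
    "mozilla.org": (50, 1),
    "mozillazine.org": (5, 1),
    "example.com": (100, 0),
}
print(suggest(history, "moz"))
```

Since an empty string matches everything, the same function covers the case where the cursor has just landed in the URL bar and nothing has been typed yet.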

Controls

The standard controls (like back and forward) are located to the left of the page. You get there by gently dragging the page in the appropriate direction. When the controls are visible, the URL bar fades in as well. This makes for a discoverable way to access the chrome, with a huge Fitts’s-law-abiding target. To get rid of the chrome, you just drag the page back to the center, and the controls slide away.
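For a rough sense of why a big target matters, Fitts's law says the difficulty of hitting a target grows with its distance and shrinks with its width. The sketch below uses the common Shannon formulation of the index of difficulty; the pixel values are made up for illustration and don't come from the demo.

```python
import math


def index_of_difficulty(distance, width):
    """Shannon form of Fitts's law: log2(distance / width + 1), in bits."""
    return math.log2(distance / width + 1)


# Hypothetical sizes: a small toolbar button versus a wide control panel,
# both 300 px away from the finger.
small_button = index_of_difficulty(distance=300, width=20)
large_panel = index_of_difficulty(distance=300, width=100)
print(f"small button: {small_button:.1f} bits, large panel: {large_panel:.1f} bits")
```

With these numbers the small button comes out at 4 bits and the large panel at 2 bits, and predicted movement time rises with that index.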

Because the controls are accessed by panning, for the most part every single pixel of the screen is filled with what you care about most: the content.

By using horizontal panning to access the controls, we avoid the iPhone’s problem of needing to go to the very top of the page to enter URLs. (Yes, I know about tapping on the top of the screen to auto-scroll, but few iPhone users, even here at Mozilla, knew the trick.) Horizontal panning does introduce the problem of accessing the controls when the page requires a lot of horizontal scrolling, but this is a rare case, and with kinetic scrolling, moving a long distance is fast. Even long left-right pages will require only one flick to get to the edge, and then one push to open the controls.

To increase the discoverability of the controls even further, it might make sense to show the controls in the zoom-out view as well (appropriately scaled with the page).

Spatial View

In the zoomed-out view, you can drag the pages around to group them in a way that makes sense to you. If you have a couple pages open to your email, and a couple pages open to some vacation planning material, you can place the similarly-themed sites physically together. This makes excellent use of our spatial memory and muscle memory: finding where you put a page on a two-dimensional plane is remarkably swift and requires little cognitive load. With the addition of semantic placement, it’s even better. Spatial view uses space in a near-optimal fashion: all pages are displayed at once in near maximum size so that extraneous interaction is not required to browse through your open pages.

This is in contrast with the standard list-based, pop-up method of choosing pages, which is fast for a small number of items but becomes increasingly cumbersome as you have to peck-hunt-and-scroll for the page you want. Such interfaces retain no sense of physicality or visual momentum, and thus lose the benefits they confer.

One of the big benefits of the Spatial view is that it is easily extensible. The bookmarks page is a good example, but think bigger. Imagine if I wanted to transfer info from one cellphone to another: I’d bring the new phone close to mine, and it would appear as an area in the spatial view. I could zoom into it (assuming I had the appropriate permissions) and see what the other phone is looking at, collaborate, and drag tabs/files/contacts back to my phone. I’m sure there are other ways of using the metaphor, too.

There are a couple ways of augmenting this view. For example, the pages can be sized based on the frequency/recency with which you view a site. That way, sites you visit more often will be larger targets. It also helps to combat the all-page-thumbnails-look-like-white-squares problem. The spatial view has some potential scaling problems on the desktop where you can have upwards of 100 tabs open, but on mobile you’ll never have more than 10ish tabs open. Even so, I think there are solutions to the desktop problem, but that will have to wait until another post.
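Sizing pages by usage could be as simple as scaling each thumbnail between a minimum and a maximum size according to its share of the top usage score. This is a sketch under made-up assumptions: the pixel bounds, the function name, and the idea of a single precomputed "usage score" per page are all hypothetical.

```python
def thumbnail_sizes(scores, min_size=60, max_size=160):
    """Map each page's usage score to a thumbnail edge length in pixels.

    scores maps page name -> non-negative usage score (higher = used more).
    The most-used page gets max_size; an unused page gets min_size.
    """
    top = max(scores.values(), default=0)
    sizes = {}
    for page, score in scores.items():
        share = score / top if top > 0 else 0.0
        sizes[page] = round(min_size + share * (max_size - min_size))
    return sizes


print(thumbnail_sizes({"mail": 10, "news": 5, "old": 0}))
```

Keeping a minimum size matters here: even a rarely visited page still has to be a finger-sized target.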

Spatial view has some passing similarities to Apple’s Expose. Both mechanisms are examples of zooming user interfaces (ZUIs), which have been around for almost as long as multitouch. The biggest difference is that with Expose, the spatial relationship between windows is broken every time the windows reshuffle. With the spatial view I know that the tab I want is down and to the left, whereas that mapping doesn’t exist using Expose. It’s an important distinction that helps the user to feel comfortable in the space and not get lost.

Get Involved

The code for the demo is open-source and available here. You can even play with it, but be warned that it is a hack-up-prototype and there are lots of edge-cases that aren’t handled. Labs is interested in turning this concept into a working Firefox extension, so if you are interested in helping to spearhead that effort, drop me a line (my first name at mozilla.com). If you’d like to get involved with Mozilla Labs more generally, check out our forums.

If you’d like to get involved on a broader level with the Mozilla mobile effort, you can join us on IRC (either #labs or #mobile on irc.mozilla.org), join the mailing list, or take a look at the Mobile wiki page.

Thanks

Creativity thrives best when not in a vacuum. There are a lot of folks I’ve bounced ideas off of, but I wanted to call out special thanks to crazy-in-the-best-way-possible Mark Finkle, drier-wit-than-thou Madhava Enros, unsung-hero Alex Faaborg, and I’ll-get-you-next-time Jenny Boriss for their invaluable input.