Live Publicly Accessible Prototype

Development is frozen, so not all features are up to date.

Try the Prototype

Business Context

Existing Product reaching end-of-life

Voyant's existing conferencing solution was bundled into a white-labeled reseller instance of Broadsoft Unified Communications (UCaaS), built and distributed by Cisco. The need for a replacement became urgent when Cisco ended support for self-hosted reseller instances in favor of cloud-hosted ones, a move that cut out our margins. Different components of the application had different end-of-support road maps; the first to lose support was the conferencing component, Broadsoft Collaborate.

A replacement for Broadsoft was paramount to the survival of the business. Voyant's reselling of Broadsoft accounted for almost 50% of our business, or roughly $30 million in annual revenue. Given the scale of that revenue, it was decided that a full replacement UCaaS solution would be built.

Customer Attrition

The existing product's conferencing experience fell short of modern competitors. Customers were supplementing our UCaaS product with third-party conferencing solutions. Unified Communications is sold on the premise that it's the only product a company needs, so every customer who purchased a different conferencing solution was a crack in the foundation of our UCaaS business.


Product Requirements

Modular Product

With a tight time frame, product leadership decided to take a modular approach. In the short term, the application could live as a feature inside Broadsoft, with a long-term plan to transition it into our new in-house UCaaS solution. Depending on the speed of our UCaaS developers, however, we might skip the short-term plan entirely. Finally, the product team was leaning towards selling the application as a stand-alone product.

Built on the Jitsi Meet Open-Source Platform
WebRTC

Jitsi Meet allows us to leverage their Web Real-Time Communications (WebRTC) front-end functionality, while transmitting audio in the back-end over our own telecom network.

Videobridge

Jitsi's Videobridge functionality allows us to quickly integrate camera and screenshare capabilities.

PSTN Dial-In via the Inteliquent Network

Leveraging our parent company's telecom network, users would be able to dial in via a traditional phone line.


Research

Goals

Optimize the in-conference experience

With no clear context on where conferencing would live, I focused my research on the commonalities between the possible product scenarios. Conferencing applications vary in how users are organized and in how they are invited to and join a conference. In-conference experiences, however, are consistent across application contexts.


Target Personas

A user group too large to target

Our conferencing product could be used by anyone who has a job.


Formative Research Methodology

In an engineering-led legacy tech company with no research experience, we needed to prove value quickly. We chose our methods for their cost-effective, data-driven format.

Quantitative
Inexpensive
While establishing our user research practice, our initial budget constraints limited the methods of research that were feasible. Quantitative research can be performed unsupervised, remotely and at a large scale.
Data Driven
Our results needed to be irrefutable by leadership.
Surveys
Low Collection Cost
Of all the research methods, surveys are the cheapest. We eliminated costs by conducting them on our own employees using Google Forms, a tool we already paid for.
Ease of Analysis
Results feed into tables and then into charts. Possible correlations are planned for when the survey is written.
Likert Scale
Consistency
Each question has the same format so participants can answer more spontaneously.

UX Benchmarking

Benchmark Survey

I surveyed employees about their general feelings towards conferencing applications. I also included questions whose answers could be correlated with one another.

Sample Size
87
See the survey
Key Performance Indicators
During web conferences, I turn my camera on
2.9 / 5
Ability to jump into conversation in a web conference compared to in-person
24% less than in-person
Active observance of facial expressions and body language in a web conference compared to in-person
46% less than in-person
Active use of facial expressions and body language in a web conference compared to in-person
17% less than in-person

Analysis

Insight: Camera usage increases meeting engagement
Hypothesis
The use of facial expressions and body language, whether conscious or not, allows participants to navigate a conversation more naturally.
UX Solution
Encourage participants to turn on their cameras to improve their engagement in the conversation
Supporting Data
Insight: Users responded to social pressure to turn their cameras on
Hypothesis
This correlation can be explained by herd mentality.
UX Solution
Same as before: encourage participants to turn on their cameras to begin the positive feedback loop.
Correlation
Users are more likely to turn their camera on if their peers, and especially their boss, have their cameras on
Users who usually turn their camera on are less likely to do so if their peers don't
Insight: Users are distracted by their own camera feed
Hypothesis
Users are concerned about their appearance
UX Solution
Reduce the size of a user's own camera feed to reduce distraction.
Distribution
Most users admitted to being distracted by their own camera feed at least sometimes

UX Solutions

Create spatial hierarchy to encourage camera use
Right Sized
Space is prioritized to show visual information
...
Inefficient
Wasted space for audio only participants
...

Layout Comparison

The ability to read and project non-verbal communication across the room contributes to the ease of participation during in-person meetings.

Hypothesis
We hypothesized that the grid view would be the most effective layout to allow participants to communicate non-verbally.
Control
The active speaker view was our control because we had to design it anyway. It's an effective layout for screensharing.
Grid View
Grid
Participants can read the room and can project their intent to speak.
...
Active Speaker View
Active Speaker
It's harder to read the room if the sidebar is overflowing. It's hard to know if people can see your intent to speak.
...

Evaluation

Methodology
Unmoderated User Testing followed by the benchmark survey
Scenario
Select Grid or Active Speaker view and participate in your morning stand-up meeting, then fill out the survey. The following day, choose the other view and repeat the survey.
Sample Size
38
Grid View
During web conferences, I turn my camera on
3.3 / 5
.4 change
Ability to jump into conversation in a web conference compared to in-person
21% less than in-person
3% change
Active observance of facial expressions and body language in a web conference compared to in-person
38% less than in-person
8% change
Active use of facial expressions and body language in a web conference compared to in-person
16% less than in-person
1% change
Active Speaker View
During web conferences, I turn my camera on
3.0 / 5
.1 change
Ability to jump into conversation in a web conference compared to in-person
26% less than in-person
2% change
Active observance of facial expressions and body language in a web conference compared to in-person
49% less than in-person
3% change
Active use of facial expressions and body language in a web conference compared to in-person
17% less than in-person
0% change

Call Controls
Validation Using Paper Prototype User Testing
Script Scenarios
Start with your audio and camera on. What do you do with your conference in each situation?
  • Your doorbell just rang
  • Your spouse asked you a question
  • Your kids are running around behind you
  • You're getting another call
Voyant's Solution
Comments from test users
"I like how audio and video controls are divided"
Zoom's Solution
Comments from test users
"It's not intuitive that the audio out controls are buried next to the microphone"
Hardware Controls

Example

Pro

  • Accounts for every situation
  • Reduces clutter
  • Have to include them anyway
  • Easy implementation for MVP
  • Scales easily as users switch between devices

Con

  • Can be slower due to multiple actions

Situational Controls

Example

Pro

  • Can provide faster controls for common situations (examples: hold, transfer)

Con

  • Difficult to account for every situation
  • Still need hardware controls
  • Can be cluttered
  • Doesn't scale easily across devices

Vanishing Interface

Conferencing interfaces can become very cluttered as features are added. However, the vast majority of participants' time is spent in conversation rather than using the controls. When the app detects no cursor movement, it hides the controls, allowing the participant to concentrate on the conversation.

Mouse Move
In the application, controls appear on mouse move. For demonstration purposes, they appear on hover here.
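
The hide-on-idle behavior can be sketched as a small state holder. This is a minimal TypeScript sketch, not the shipped implementation: the class name `ControlVisibility`, its methods, and the 3000 ms idle window are assumptions made for illustration.

```typescript
// Minimal sketch of the vanishing interface: controls stay visible for a
// fixed window after the last cursor movement, then hide.
// ControlVisibility and the timeout value are hypothetical, not from the app.
class ControlVisibility {
  private readonly hideAfterMs: number;
  private lastActivityMs: number;

  constructor(hideAfterMs: number, startedAtMs: number = 0) {
    this.hideAfterMs = hideAfterMs;
    this.lastActivityMs = startedAtMs;
  }

  // Call whenever the cursor moves.
  recordActivity(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // Controls are shown until the idle window has fully elapsed.
  isVisible(nowMs: number): boolean {
    return nowMs - this.lastActivityMs < this.hideAfterMs;
  }
}

const controls = new ControlVisibility(3000);
controls.recordActivity(1000);
console.log(controls.isVisible(2500)); // true: 1.5 s since the last movement
console.log(controls.isVisible(4500)); // false: 3.5 s since the last movement
```

In a browser, `recordActivity` would typically be called from a `mousemove` listener, with the result of `isVisible` used to toggle a CSS class on the controls.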
...

Dev Handoff

Speaking Indicator

This speaking indicator design works for both users in the video grid and users in the audio sidebar. Because it sits entirely in the margins between participants, it doesn't block anything. It could also react to volume and duration.

See the Pen Speaking Indicator by Michael Kronenberg (@mkronenberg) on CodePen.


Grid Optimization

Since CSS alone could not accomplish the task without potentially overflowing the container, I created a formula to determine the most efficient arrangement.

Formula Walk-through

Find the grid arrangement with the largest area per item

Larger

Smaller

If two arrangements produce the same area per item, choose the arrangement with the grid ratio closest to 1

Grid Ratio: 1

Grid Ratio: 1.5

The grid arrangement may not have a negative margin to neighboring elements on the page

Positive Margin

Negative Margin

Video feeds always maintain a 16∶9 ratio

Correct Aspect Ratio

Incorrect Aspect Ratio

Variables
Browser Width & Height
Number of Participants in the Grid
Design Constraints
Use the browser dimensions to calculate the grid canvas dimensions
Canvas Width = Browser Width − Sidebar Width − (Grid Padding × 2)
Canvas Height = Browser Height − (Grid Padding × 2)

Use the number of participants to calculate the number of rows for each arrangement from 1 to 8 columns
Rows = RoundUp ( Number of Participants ÷ Columns )
Example: 8 Participants
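
The canvas and row calculations above can be sketched as follows. This is a minimal TypeScript sketch under stated assumptions: the sidebar width (320) and grid padding (16) are hypothetical values, and the function names `gridCanvas` and `arrangements` are illustrative only.

```typescript
// Hypothetical layout constants; the real values are not given in this write-up.
const SIDEBAR_WIDTH = 320;
const GRID_PADDING = 16;
const MAX_COLUMNS = 8;

interface Size {
  width: number;
  height: number;
}

interface Arrangement {
  columns: number;
  rows: number;
}

// Canvas = browser viewport minus the sidebar and the padding on each side.
function gridCanvas(browser: Size): Size {
  return {
    width: browser.width - SIDEBAR_WIDTH - GRID_PADDING * 2,
    height: browser.height - GRID_PADDING * 2,
  };
}

// One arrangement per column count from 1 to 8; rows are rounded up so
// every participant gets a cell.
function arrangements(participants: number): Arrangement[] {
  const result: Arrangement[] = [];
  for (let columns = 1; columns <= MAX_COLUMNS; columns++) {
    result.push({ columns, rows: Math.ceil(participants / columns) });
  }
  return result;
}

console.log(gridCanvas({ width: 1920, height: 1080 }));
// { width: 1568, height: 1048 }
console.log(arrangements(8).map((a) => `${a.columns}x${a.rows}`).join(" "));
// 1x8 2x4 3x3 4x2 5x2 6x2 7x2 8x1
```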

For each arrangement, create 2 variations:
Example: 8 participants, grid arrangement 3 columns, 3 rows.
Horizontal grid fit to the grid canvas
Total Grid width = Grid Canvas width
Vertical grid fit to the grid canvas
Total Grid height = Grid Canvas height
Calculate the width per feed for a horizontal grid fit to the grid canvas
Width per Feed = (Total Grid Width − (Grid Gap × (Columns − 1))) ÷ Columns
Calculate the height per feed for a vertical grid fit to the grid canvas
Height per Feed = (Total Grid Height − (Grid Gap × (Rows − 1))) ÷ Rows
Use the width per feed to calculate the height per feed and the total grid height
Height per Feed = Width per Feed × (9÷16)
Total Grid Height = (Height per Feed × Rows) + (Grid Gap × (Rows − 1))
Use the height per feed to calculate the width per feed and the total grid width
Width per Feed = Height per Feed × (16÷9)
Total Grid Width = (Width per Feed × Columns) + (Grid Gap × (Columns − 1))
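
The two variations for a single arrangement can be sketched as follows. This is a minimal TypeScript sketch: the 8 px grid gap, the canvas sizes, and the function names `fitHorizontal` and `fitVertical` are assumptions for illustration.

```typescript
const GRID_GAP = 8; // hypothetical gap between feeds, in pixels

interface Fit {
  feedWidth: number;
  feedHeight: number;
  totalWidth: number;
  totalHeight: number;
}

// Horizontal fit: the grid spans the full canvas width and the feed
// height follows from the 16:9 ratio.
function fitHorizontal(canvasWidth: number, columns: number, rows: number): Fit {
  const feedWidth = (canvasWidth - GRID_GAP * (columns - 1)) / columns;
  const feedHeight = (feedWidth * 9) / 16;
  const totalHeight = feedHeight * rows + GRID_GAP * (rows - 1);
  return { feedWidth, feedHeight, totalWidth: canvasWidth, totalHeight };
}

// Vertical fit: the grid spans the full canvas height and the feed
// width follows from the 16:9 ratio.
function fitVertical(canvasHeight: number, columns: number, rows: number): Fit {
  const feedHeight = (canvasHeight - GRID_GAP * (rows - 1)) / rows;
  const feedWidth = (feedHeight * 16) / 9;
  const totalWidth = feedWidth * columns + GRID_GAP * (columns - 1);
  return { feedWidth, feedHeight, totalWidth, totalHeight: canvasHeight };
}

// 8 participants in 3 columns and 3 rows on a hypothetical 1456 x 900 canvas.
const horizontal = fitHorizontal(1456, 3, 3);
const vertical = fitVertical(900, 3, 3);
console.log(horizontal.feedWidth, horizontal.feedHeight, horizontal.totalHeight);
// 480 270 826
console.log(vertical.totalWidth > 1456);
// true: the vertical variation would overflow the canvas width
```

For any arrangement, at most one of the two variations fits inside the canvas unless both produce the same grid, which is why the later steps check the margin of each.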
Calculate the Vertical Margin between the Grid Canvas Height and the Total Grid Height
Vertical Margin = Grid Canvas Height − Total Grid Height
Valid Layout
Calculate the margin between the Grid Canvas Width and the Total Grid Width
Horizontal Margin = Grid Canvas Width − Total Grid Width
Invalid Layout
Calculate the Area per Feed
Area per Feed = Width per Feed × Height per Feed
Layout Eliminated
Compare all valid layouts for the largest Area per Feed
Largest Area per Feed
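
Putting the steps together, the whole selection can be sketched as one function. This is a minimal TypeScript sketch rather than the production code: the name `bestLayout`, the default 8 px gap, and the floating-point tolerance are assumptions, and the grid ratio used for tie-breaking is assumed to be columns ÷ rows.

```typescript
interface Layout {
  columns: number;
  rows: number;
  feedWidth: number;
  feedHeight: number;
}

const EPSILON = 1e-9; // tolerance for floating-point comparisons

function bestLayout(
  canvasWidth: number,
  canvasHeight: number,
  participants: number,
  gap: number = 8, // hypothetical grid gap in pixels
  maxColumns: number = 8,
): Layout | null {
  let best: Layout | null = null;
  let bestArea = 0;

  for (let columns = 1; columns <= maxColumns; columns++) {
    const rows = Math.ceil(participants / columns);
    // Feed width when the grid fills the canvas width, and feed height
    // when the grid fills the canvas height.
    const widthFit = (canvasWidth - gap * (columns - 1)) / columns;
    const heightFit = (canvasHeight - gap * (rows - 1)) / rows;

    // Both variations keep every feed at 16:9.
    const variations: Layout[] = [
      { columns, rows, feedWidth: widthFit, feedHeight: (widthFit * 9) / 16 },
      { columns, rows, feedWidth: (heightFit * 16) / 9, feedHeight: heightFit },
    ];

    for (const layout of variations) {
      const totalWidth = layout.feedWidth * columns + gap * (columns - 1);
      const totalHeight = layout.feedHeight * rows + gap * (rows - 1);
      const horizontalMargin = canvasWidth - totalWidth;
      const verticalMargin = canvasHeight - totalHeight;

      // Eliminate layouts with a negative margin or no visible feed.
      const overflows = horizontalMargin < -EPSILON || verticalMargin < -EPSILON;
      if (layout.feedWidth <= 0 || overflows) continue;

      const area = layout.feedWidth * layout.feedHeight;
      const larger = area > bestArea + EPSILON;
      const tied = Math.abs(area - bestArea) <= EPSILON;
      // Tie-break: prefer the grid ratio (assumed columns / rows) closest to 1.
      const closerToSquare =
        best !== null &&
        Math.abs(columns / rows - 1) < Math.abs(best.columns / best.rows - 1);

      if (best === null || larger || (tied && closerToSquare)) {
        best = layout;
        bestArea = area;
      }
    }
  }
  return best;
}

console.log(bestLayout(1456, 826, 8));
// { columns: 3, rows: 3, feedWidth: 480, feedHeight: 270 }
```

The function returns `null` when no arrangement fits, for example when the canvas is smaller than the gaps alone, so a caller would need a fallback for very small windows.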