Implementing cursor sharing with Daily's video call API

Introduction

I know exactly what you're thinking: "Liza. How do I build a cursor sharing feature in my application with nothing but vanilla JavaScript and Daily?"

Good news: you're about to find out.

In this post, we'll go through how you can integrate cursor sharing features with Daily: so that your users can share not just their beautiful faces, but their beautiful cursors as well.

What we're building

We're building a small demo which embeds a Daily video call on the left and shows some content for call participants to review together on the right. When participants position their mouse over the content area, other call participants can see their cursor position on their own screens.

We'll be implementing all of this with vanilla JavaScript, CSS, and daily-js. No animation libraries, server components, or other fanciness required.

For this demo, we're using Daily Prebuilt, but you can use the same kind of implementation in an app utilizing our fully customizable Client SDK as well.

Animation of a remote participant's cursor being shared in page content

Getting started

First, you'll need to create a free Daily account if you don't already have one. Then, create a room through your Daily dashboard.

Clone and run our demo repository by running the following commands in your terminal:

git clone git@github.com:daily-demos/cursor-sharing.git
cd cursor-sharing
git checkout v1.0.0
npm i
npm run start

Now, note the address and port shown in your terminal (usually 127.0.0.1:8080) and navigate to it in two different browsers. Then paste the URL of the Daily room you just created into the join form in both browser windows.

After joining the call, move your mouse around in the content area on the right-hand side of the video call. You should see a remote cursor moving in the other browser instance you've opened up:

Cursor sharing between two browser instances

Now that you can see first-hand what we're creating, let's dig into the implementation.

The power of Daily's "app-message" events

Daily's "app-message" events enable call participants to send data to other participants. The local client can send messages to everyone else in the call, or just specific users based on their session IDs. For our cursor-sharing implementation, the data we're sending will be cursor coordinates, and it will be sent to all other participants.

Here's what our "app-message" payload looks like:

{
  'type': 'cursorPos',
  'x': 0,
  'y': 0
}

The x and y values above will be replaced with the user's actual cursor coordinates in relation to the <div> element that displays our text (also known as our "content" div).

Fun fact: In peer-to-peer mode, Daily’s "app-message" events are sent over a WebRTC data channel. In SFU mode, they are sent over a websocket.
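
To make the two delivery options concrete, here's a minimal sketch of broadcasting a cursor message versus sending it to a single participant. The fake call object and the session ID are made up so the snippet runs on its own. I'm also assuming that cursorUpdateMessageName holds the 'cursorPos' string from the payload above, and that sendAppMessage() takes an optional second argument naming the recipient, defaulting to everyone ('*').

```javascript
// Message type shared by sender and receiver (assumed value).
const cursorUpdateMessageName = 'cursorPos';

// Build the payload described above from a pair of coordinates.
function buildCursorMessage(x, y) {
  return { type: cursorUpdateMessageName, x, y };
}

// Hypothetical stand-in for a Daily call object, used only so this
// sketch runs outside a browser. It records what would be sent.
function createFakeCall() {
  const sent = [];
  return {
    sent,
    // Assumed to mirror daily-js: sendAppMessage(data, to), where
    // `to` defaults to '*' (everyone else in the call).
    sendAppMessage(data, to = '*') {
      sent.push({ data, to });
    },
  };
}

const call = createFakeCall();

// Broadcast to every other participant.
call.sendAppMessage(buildCursorMessage(120, 48));

// Send to one participant only ('abc-123' is a made-up session ID).
call.sendAppMessage(buildCursorMessage(120, 48), 'abc-123');
```

The demo only ever uses the broadcast form, since every participant should see every cursor.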

Performance and space considerations

We're going to try to be conservative with how much data we send and process by only broadcasting a local user's cursor position to all other participants after the cursor has paused for 100 milliseconds. Every application is different, and you might want to play around with this value to decide what's appropriate for your use case. It's a balance between keeping the cursor data up-to-date while at the same time ensuring a manageable flow of data for processing.

We've also had to account for common cases of DOM discrepancies between browsers and devices. A div that is 500px high in one browser may be slightly larger or smaller in another, making it difficult to position remote cursors in a way that accurately represents their location in relation to other content elements.

Varying default line heights seem to be a common cause for this kind of discrepancy. These mismatches in element size can trip up our display of remote cursors: if a user is hovering over a particular word, we want remote users to see their cursor over exactly that word.

In our demo, we've taken a few precautionary measures to try to ensure consistency of cursor positioning:

  • Fixed size page layout: no fancy responsive layouts here
  • Pixel units for both text and line height inside our content DOM element. Both em and pt values resulted in slightly different dimensions across browsers, so we went with pixels for uniformity

Application entry point

The entry point of our application is index.html, which sets up the DOM and imports our primary JavaScript entry point: index.js.

As you probably saw when you ran the demo above, the first thing the user sees when they load the app is a join form which accepts a Daily room URL:

Cursor sharing demo call join form

In index.js, we'll set up the handler for our call entry form:

window.addEventListener('DOMContentLoaded', () => {
  // Set up entry
  const form = document.getElementById('enterCall');
  form.onsubmit = (ev) => {
    ev.preventDefault();
    const roomInput = document.getElementById('roomURL');
    const roomURL = roomInput.value;
    joinCall(roomURL);
    showCall();
  };
});

Above, we retrieve the entry form and set up its submission handler to obtain the room URL input value, join the call, and show the call UI (thereby hiding the entry form). Let's look at the most interesting part of the operation: joining the call.
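
The demo passes whatever the user typed straight to joinCall(). If you'd like to catch obvious typos first, a small check along these lines might help. This is only a sketch: isLikelyRoomURL() is a hypothetical helper that is not part of the demo, and the ".daily.co" host check is an assumption you'd want to drop if your rooms live on a custom domain.

```javascript
// Returns true when the input parses as an https URL with a room
// path. The ".daily.co" host check is an assumption for this sketch.
function isLikelyRoomURL(input) {
  let url;
  try {
    url = new URL(input.trim());
  } catch {
    return false; // Not a parseable URL at all
  }
  const hasRoomName = url.pathname.length > 1; // more than just "/"
  return (
    url.protocol === 'https:' &&
    url.hostname.endsWith('.daily.co') &&
    hasRoomName
  );
}
```

In the submit handler, you could call this before joinCall(roomURL) and show a message instead of joining when it returns false.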

Joining the video call

Joining the video call involves the following:

  • Instantiating a Daily Prebuilt iframe
  • Configuring Daily event handlers
  • Calling join() on the Daily iframe

Let's go through each of these steps.

Instantiating a Daily Prebuilt iframe

The first step to joining a Daily Prebuilt call is creating a Prebuilt iframe. This iframe is appended to whatever parent element you choose to provide. In this case, that's our container div:

export default function joinCall(roomURL) {
  const container = document.getElementById('call');

  const callFrame = window.DailyIframe.createFrame(container, {
    showLeaveButton: true,
    activeSpeakerMode: false,
  })
  // ...The rest of the function here

The createFrame() factory method instantiates the iframe with the given settings. Here, I'm instructing Daily to show a Leave button to the user and disable active speaker mode. With active speaker mode disabled, all video call participants are shown in grid mode by default, which means all participants' video tiles get the same amount of space in the call UI regardless of who is speaking.

After instantiating the Daily Prebuilt iframe, the next step is to define the event handlers we'll need to enable our cursor sharing feature.

Configuring Daily event handlers

The Daily events we'll work with for this demo are:

  • "joined-meeting"
  • "app-message"
  • "left-meeting"
  • "participant-left"

Here's how the handlers for the above events are configured once the Daily iframe is instantiated. I've left some explanatory comments in-line, but we're going to go through the details shortly.

    callFrame.on('joined-meeting', () => {
      // Define the code to execute when the cursor
      // listener detects relevant mouse movement
      const callback = (x, y) => {
        callFrame.sendAppMessage({
          type: cursorUpdateMessageName,
          x,
          y,
        });
      };
      
      // Start monitoring local mouse movement
      startCursorListener(callback);
    })
    .on('app-message', (e) => {      
      // We're going to parse the message contents
      // and update remote cursors in this function
      handleAppMessage(callFrame, e);
    })
    .on('left-meeting', () => {
      // Remove all cursor divs and show the call join form
      removeAllCursors();
      stopCursorListener();
      showEntry();
    })
    .on('participant-left', (e) => {
      // Remove remote departed participant's cursor
      removeCursor(e.participant.session_id);
    });

Let's start at the top: beginning the process of monitoring the local participant's mouse position when they join the Daily video call, and broadcasting that information to other participants.

Making a join() call

As the final step in our video call setup, we make a call to Daily's join() instance method with the room URL retrieved from the join form:

  callFrame.join({ url: roomURL });

This begins the process of joining the Daily video call, which we know is successful when we get a "joined-meeting" event from Daily.

Monitoring and broadcasting mouse position

Earlier on, we created a "joined-meeting" event handler. Here it is again:

    callFrame.on('joined-meeting', () => {
      // Define the code to execute when the cursor
      // listener detects relevant mouse movement
      const callback = (x, y) => {
        callFrame.sendAppMessage({
          type: cursorUpdateMessageName,
          x,
          y,
        });
      };
      
      // Start monitoring local mouse movement
      startCursorListener(callback);
    })

Above, you'll note that we call a function called startCursorListener() and give it a callback. The callback is the code that will be run when the cursor listener decides it's time to send cursor position data to other video call participants. Let's take a closer look at how this is implemented.

startCursorListener() is defined in the cursor.js file in our demo repository:

let mouseStopTimeout;
let contentDiv;

export function startCursorListener(callback) {
  contentDiv = document.getElementById('content');
  const scrollDiv = document.getElementById('scroll');
  scrollDiv.onmousemove = (e) => {
    clearTimeout(mouseStopTimeout);

    mouseStopTimeout = setTimeout((_) => {
      sendData(e, callback);
    }, 100);
  };
}

Above, when the process of monitoring mouse movements is initialized, I also set the contentDiv variable, which will be used throughout the rest of the cursor code. This content div is the one that holds the text which participants will be reading.

Next, I retrieve the scrollDiv element, which is a scrollable wrapper around the content element.

In index.html, the content and scroll divs are defined as follows:

          <div id="scroll">
            <div id="content">
              <h2>The Raven</h2>
              <h3>By Edgar Allan Poe</h3>
              <!-- The rest of the poem here -->
            </div>
          </div>

After retrieving the scrollable wrapper element, I assign it an onmousemove handler to begin listening for "mousemove" events within this scrollable div.

Finally, in the handling of the "mousemove" event, we're going to do a couple of things:

  • Clear any existing mouseStopTimeout
  • Set a new mouseStopTimeout, which calls sendData() after 100 milliseconds

This accomplishes the mouse-pause handling I described above, in which we only want to send data to other participants once the mouse has stopped moving for 100ms. Each time the mouse moves, the timeout is reset.
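
If you'd like to see the reset-on-every-move behavior in isolation, here's a rough sketch of the same pattern pulled out into a standalone helper. createPauseDetector() and createFakeTimers() are hypothetical names that don't appear in the demo. The fake timers exist only so the example can run through instantly instead of waiting on real time.

```javascript
// Returns a function to call on every mouse move. `onPause` fires
// only after `delayMs` pass with no further calls. The `timers`
// parameter is a hypothetical hook so the logic can be exercised
// without real time passing.
function createPauseDetector(onPause, delayMs = 100, timers = globalThis) {
  let timeoutId;
  return function onMove(position) {
    // A new movement cancels the pending send...
    timers.clearTimeout(timeoutId);
    // ...and schedules a fresh one with the latest position.
    timeoutId = timers.setTimeout(() => onPause(position), delayMs);
  };
}

// Tiny fake timer implementation used to walk through the behavior.
// It ignores the delay and runs callbacks only when flushed.
function createFakeTimers() {
  let nextId = 1;
  const pending = new Map();
  return {
    setTimeout(fn) {
      const id = nextId++;
      pending.set(id, fn);
      return id;
    },
    clearTimeout(id) {
      pending.delete(id);
    },
    // Runs everything still scheduled, as if the delay had elapsed.
    flush() {
      const fns = [...pending.values()];
      pending.clear();
      fns.forEach((fn) => fn());
    },
  };
}

const fakeTimers = createFakeTimers();
const sentPositions = [];
const onMove = createPauseDetector(
  (pos) => sentPositions.push(pos),
  100,
  fakeTimers
);

// Three rapid movements, then a pause.
onMove({ x: 1, y: 1 });
onMove({ x: 5, y: 2 });
onMove({ x: 9, y: 4 });
fakeTimers.flush();
// sentPositions now holds only the final position: { x: 9, y: 4 }
```

Only the last of the three movements is sent, which is exactly the trade-off described earlier: fewer messages in exchange for remote cursors that update once the mouse comes to rest.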

Once the mouse has been stationary for 100ms, sendData() finishes the job:

function sendData(e, callback) {
  if (!contentDiv) return;
  // Send data relative to the user's position
  // in the content div
  const rect = contentDiv.getBoundingClientRect();
  const x = e.clientX - rect.x;
  const y = e.clientY - rect.y;
  callback(x, y);
}

Above, we get the mouse position relative to the content div (the one that holds the text) and execute the callback we provided from the Daily "joined-meeting" event handler with that position data:

      const callback = (x, y) => {
        callFrame.sendAppMessage({
          type: cursorUpdateMessageName,
          x,
          y,
        });
      };

sendAppMessage() is a Daily method to broadcast the given data to all other call participants.
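
To make the coordinate math in sendData() concrete, here's the same subtraction as a small standalone function with made-up numbers. toContentCoords() is a hypothetical name used just for this sketch. Because getBoundingClientRect() reports positions relative to the viewport, the content div's rect shifts as the wrapper scrolls, so the result stays relative to the text itself.

```javascript
// Converts viewport coordinates (like MouseEvent.clientX/clientY)
// into coordinates relative to an element's top-left corner.
// `rect` has the shape returned by getBoundingClientRect().
function toContentCoords(clientX, clientY, rect) {
  return { x: clientX - rect.x, y: clientY - rect.y };
}

// Made-up numbers: the content div starts 400px from the left of
// the viewport, and has been scrolled so that its top edge sits
// 250px above the top of the viewport.
const rect = { x: 400, y: -250 };
const pos = toContentCoords(520, 90, rect);
// pos is { x: 120, y: 340 }
```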

Finally: our cursor position is broadcast for all to see. But how do other clients handle it? Let's cover the star of the show: the "app-message" event.

Handling remote cursor data with "app-message" events

When sendAppMessage() is called, relevant Daily video call participants receive an "app-message" event with the data which was sent. In the cursor sharing demo, the handler for this event looks as follows:

function handleAppMessage(callFrame, e) {
  // Retrieve data from the event
  const { data } = e;

  // If the event type is not what we expect, early out
  if (data.type !== cursorUpdateMessageName) return;

  // If there's no valid position data in the event data,
  // throw an error. 0 is a valid coordinate, so check for
  // finite numbers rather than truthiness.
  if (!Number.isFinite(data.x) || !Number.isFinite(data.y)) {
    throw new Error('invalid cursor position data');
  }

  // Retrieve participant who sent this message
  const p = callFrame.participants()[e.fromId];

  // Retrieve the user name of the participant
  // who sent this message, OR just their ID 
  // if they don't have a user name set.
  let userName = e.fromId;
  if (p && p.user_name) {
    userName = p.user_name;
  }
  // Update the participant's remote cursor
  updateRemoteCursor(e.fromId, userName, data.x, data.y);
}

You might be able to tell from the in-line comments what the function is doing. In short, it performs the following:

  • Parses the event data (which is expected to be in a certain format)
  • Retrieves the name (or ID, if the name is not available) of the user who sent the event
  • Sends the above information to our cursor.js file, which updates the cursor visualization for that participant via updateRemoteCursor()
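
One detail worth keeping in mind when parsing the event data: a coordinate of 0 is a perfectly legitimate position (the very top or left edge of the content div), but it is falsy in JavaScript. Here's a minimal sketch of a parser that treats 0 as valid. parseCursorMessage() is a hypothetical helper, and I'm assuming cursorUpdateMessageName holds the 'cursorPos' string from the payload shown earlier.

```javascript
// Assumed value of the demo's message type constant.
const cursorUpdateMessageName = 'cursorPos';

// Returns { x, y } for a well-formed cursor message, or null for
// anything else. Number.isFinite() accepts 0 but rejects undefined,
// NaN, strings, and Infinity.
function parseCursorMessage(data) {
  if (!data || data.type !== cursorUpdateMessageName) return null;
  if (!Number.isFinite(data.x) || !Number.isFinite(data.y)) return null;
  return { x: data.x, y: data.y };
}
```

Returning null instead of throwing is a design choice for this sketch: it lets the caller quietly ignore messages it doesn't understand, which can be handy if your app sends other kinds of "app-message" data too.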

updateRemoteCursor() then creates or updates a div element for the given remote user's cursor visualization:

export function updateRemoteCursor(userID, userName, posX, posY) {
  if (!contentDiv) return;
  
  let cursorDiv = document.getElementById(getCursorID(userID));
  if (!cursorDiv) {
    cursorDiv = createCursorDiv(userID);
  }

  const text = `↖️ ${userName}`;
  if (cursorDiv.innerText !== text) {
    cursorDiv.innerText = text;
  }
  cursorDiv.style.left = `${posX}px`;
  cursorDiv.style.top = `${posY}px`;
}

Above, I first retrieve the remote cursor div for the given user ID. If the div does not exist, I call createCursorDiv() to create one. This function appends the new div to our existing content div (the one that holds all the text).

Next, the content of the div is set with an arrow emoji and the user's name.

You might wonder why we don't do this just once when first creating the div. This is because in Daily Prebuilt, a user may change their user name at any time. We want their remote cursor to reflect that.

Finally, we update the CSS style of the div with the cursor's updated position. That's it!

I know what you're thinking: "But how does it move?!"

Creating smooth repositioning of the mouse cursor

We definitely don't want the cursor to just suddenly jump from one position to another. We want a smooth transition, so that users can see remote cursors gliding to their new position in the DOM.

We accomplish this with CSS. On creation, each cursor div is assigned a cursor class, which has the following styling applied:

.cursor {
    background-color: rgba(255,255,255,0.5);
    border-radius: 5px;
    position: absolute;
    transition: all 0.5s ease-in-out;
    color: blue;
    font-size: 14px;
}

The two most important properties here are position and transition. position: absolute ensures that the cursor is taken out of the flow of the rest of the div and positioned precisely at the specified coordinates without impacting the rest of the elements.

transition: all 0.5s ease-in-out creates the animation effect when going from one position to another. The cursor will take 0.5 seconds to move to its new position. The ease-in-out effect will make the cursor move a little slower toward the beginning and the end of the transition, instead of maintaining the same speed throughout. I thought this speed and transition setting created the nicest looking effect, but I'd suggest playing around with it yourself and seeing what you like best. You’ll probably also want to coordinate this transition time with the cursor pause time you choose for throttling the data you send (which we went over in the "mousemove" listener earlier) to get the nicest effect.
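
One way to keep the two timings coordinated is to derive the transition from the same constant as the pause time. The sketch below is just that, a sketch: CURSOR_PAUSE_MS and cursorTransition() are placeholder names of my own, and the 5x ratio simply reproduces the demo's 100ms pause and 0.5s transition.

```javascript
// Hypothetical shared setting: how long the cursor must rest before
// its position is sent.
const CURSOR_PAUSE_MS = 100;

// Builds a CSS transition value from a duration in milliseconds.
function cursorTransition(durationMs) {
  return `all ${durationMs}ms ease-in-out`;
}

// The demo's values: a 100ms pause and a 0.5s (500ms) transition,
// so the transition here is 5x the pause time.
const transition = cursorTransition(CURSOR_PAUSE_MS * 5);
// transition is "all 500ms ease-in-out"
```

In the browser, you could then apply it with cursorDiv.style.transition = transition rather than hard-coding the duration in CSS, so changing the pause time adjusts the animation along with it.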

Leaving the meeting

You might remember that we also listen for "left-meeting" and "participant-left" events when setting up our Daily iframe.

"left-meeting" is invoked when the local participant leaves the Daily call. When this happens, we remove all cursor divs from the DOM and send the user back to the join form. You can check out this implementation on GitHub.

"participant-left" is invoked when a remote participant leaves the Daily call. In this case, we find and remove their specific cursor div—they’re not in the call anymore, so we don't want to leave their cursor hanging around, either.

Conclusion

In this post, we've gone through a small cursor sharing implementation using nothing but vanilla JavaScript, CSS, and Daily's video call API. I hope this helps you wrap your mind around one potential way to implement cursor sharing with Daily, and maybe sparks some ideas for other creative uses of Daily's "app-message" events.

If you have any questions, ideas, or feedback, come discuss this post on our Discord community. You can also find us on Twitter or reach out via email.
