It’s pretty hard to debate that 2020 was the year that real-time video exploded. That explosion was propelled by the proliferation of many popular video services from very big companies, like Google, Microsoft, and Zoom. More interestingly, it was also propelled by lots of exciting new or growing companies, many of which were powered by Daily’s video API.
So what’s the hot new thing for 2021? Well, if you spend any time on Twitter, it would undoubtedly be a certain exclusive audio-only app that feels like a private "club" 😉. This year seems poised to do for audio what 2020 did for video. Maybe it’s video call fatigue. Maybe it’s hype. Or maybe we all just need something a little easier, something that perhaps feels more personal.
The nice thing about making video calls accessible for developers is that we already have great support for audio. Isn’t it great when we can support something by turning features off? 😎
So disable your camera, settle in, and let’s build a little audio chat demo in a single HTML file.
Getting started
If you would like to skip right to the end you can remix our sample app on Glitch by clicking here.
First off, you’ll need two things:
- A Daily account (Sign up at https://www.daily.co/)
- A Daily room to test with (Create one here https://dashboard.daily.co/rooms)
- A text editor (ok fine, three things)
Open the text editor of your choice, create a new file, and add the following:
<html>
  <head>
    <title>audio-only with WebRTC and the Daily API</title>
    <script src="https://unpkg.com/@daily-co/daily-js"></script>
  </head>
  <body onload="main()">
  </body>
</html>
This is the scaffolding of an HTML file. We’re telling it to load the daily-js library from the unpkg CDN, which gives us access to all of that Daily video API goodness. One thing worth noting is the onload="main()" attribute, which tells the browser to call main() when all of the content on the page has loaded. More on that shortly.
Cameras off, please
With our scaffold in place, we’ll need some UI controls. Add the following inside the <body> tag.
<div id="local-controls">
  <button id="join" onclick="joinRoom()">join room</button>
  <hr />
  <button id="leave" onclick="leaveRoom()">leave room</button>
  <hr />
  <button id="toggle-mic" onclick="call.setLocalAudio(!call.localAudio())">
    toggle mic state
  </button>
  <hr />
  <div id="participants"></div>
</div>
Inside our local-controls div we’ve added buttons to join, leave, and toggle mic state. Each of these has its own onclick handler, which calls a function. We’ll get to those in a minute.
First, let’s set up our main() function. We’ll add this after the HTML block above.
<script>
async function main() {
  // CHANGE THIS TO A ROOM WITHIN YOUR DAILY ACCOUNT
  const ROOM_URL = "INSERT YOUR ROOM URL HERE";
  window.call = DailyIframe.createCallObject({
    url: ROOM_URL,
    // audioSource can be true, false, or a MediaStreamTrack object
    audioSource: true,
    videoSource: false,
    dailyConfig: {
      experimentalChromeVideoMuteLightOff: true
    }
  });
  call.on("joined-meeting", () => console.log("[JOINED MEETING]"));
  call.on("error", e => console.error(e));
  call.on("track-started", playTrack);
  call.on("track-stopped", destroyTrack);
  call.on("participant-joined", updateParticipants);
  call.on("participant-left", updateParticipants);
}
// subsequent code goes here
</script>
Here, we’re hardcoding the Daily room (in ROOM_URL) that we told you to create above (head to your Dashboard if you forgot to do that). If you plan to deploy this app to production, though, you’ll want to create rooms dynamically. Read more about that here.
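To give a rough idea of what "creating rooms dynamically" could look like, here is a minimal sketch that builds a request for the rooms endpoint of Daily's REST API. The helper name buildCreateRoomRequest, the one-hour default lifetime, and the DAILY_API_KEY environment variable in the usage comment are our own assumptions for illustration, so check the property names against the REST API reference before relying on them. This belongs on a server, since the API key should never be shipped to the browser.

```javascript
// Sketch only: builds (but does not send) a "create room" request.
// buildCreateRoomRequest is a hypothetical helper, not part of daily-js.
function buildCreateRoomRequest(apiKey, ttlSeconds = 3600, now = Date.now()) {
  return {
    url: "https://api.daily.co/v1/rooms",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        properties: {
          // exp is a Unix timestamp in seconds after which the room expires
          exp: Math.floor(now / 1000) + ttlSeconds,
          // start with cameras off, to match the audio-only demo
          start_video_off: true,
        },
      }),
    },
  };
}

// Possible server-side usage, assuming a runtime with a global fetch and
// an API key stored in a (hypothetical) DAILY_API_KEY environment variable:
//
//   const { url, init } = buildCreateRoomRequest(process.env.DAILY_API_KEY);
//   const room = await (await fetch(url, init)).json();
//   // room.url would then replace the hardcoded ROOM_URL
```

Keeping the request building separate from the actual fetch call makes the sketch easy to inspect without touching the network.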
Next up, we’re instantiating a Daily call via the createCallObject factory method and storing it on the window object. This allows us to access any daily-js method using call wherever we need. Here, we’re passing the following options:
- url: ROOM_URL - our room URL
- audioSource: true - as noted in the comment, this can be a bool or a MediaStreamTrack; true in our case
- videoSource: false - same as above, but false since this is a camera free zone 🚧
- dailyConfig: { experimentalChromeVideoMuteLightOff: true } - this keeps the camera light off when we request device permissions but never turn on the camera
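Since the url option starts out as a placeholder string, it can save some head-scratching to check it before creating the call object. The following is a small sketch under our own assumptions: isLikelyDailyRoomUrl is a hypothetical helper, and it assumes room URLs of the form https://your-domain.daily.co/room-name, which will not hold if you use a custom domain.

```javascript
// Sketch only: a rough sanity check for the room URL.
// isLikelyDailyRoomUrl is a hypothetical helper, not part of daily-js.
function isLikelyDailyRoomUrl(value) {
  let parsed;
  try {
    parsed = new URL(value);
  } catch (err) {
    // Not a URL at all, e.g. the "INSERT YOUR ROOM URL HERE" placeholder
    return false;
  }
  // A room URL needs a path beyond the bare "/"
  const hasRoomName = parsed.pathname.length > 1;
  return (
    parsed.protocol === "https:" &&
    parsed.hostname.endsWith(".daily.co") &&
    hasRoomName
  );
}

// Possible usage at the top of main():
//
//   if (!isLikelyDailyRoomUrl(ROOM_URL)) {
//     console.error("ROOM_URL does not look like a Daily room URL:", ROOM_URL);
//     return;
//   }
```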
Lastly, we’re wiring up event callbacks via the call.on method. This is where most of the Daily magic happens, so we’ll go over each of those callbacks in due time.
Should I stay or should I...leave?
Remember those onclick handlers we talked about earlier? We need to create the functions that they will call. After the body of the main() function, add the following:
async function joinRoom() {
  await call.join();
}
async function leaveRoom() {
  await call.leave();
}
These are async since the Daily methods they call (join() and leave()) each return a Promise, and we want to await the result.
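Because we await the Promise, we can also catch a failed join instead of letting it go unhandled. Here is a minimal sketch, assuming join() rejects when joining fails; joinWithStatus is a hypothetical helper name, and it takes the call object as a parameter so it can be tried out with a stand-in object.

```javascript
// Sketch only: wrap join() so a failure becomes a value we can inspect.
// joinWithStatus is a hypothetical helper, not part of daily-js.
async function joinWithStatus(call) {
  try {
    await call.join();
    return { ok: true, error: null };
  } catch (err) {
    console.error("[JOIN FAILED]", err);
    return { ok: false, error: err };
  }
}

// Possible usage in the demo's join handler:
//
//   async function joinRoom() {
//     const result = await joinWithStatus(call);
//     if (!result.ok) {
//       // e.g. show a message in the page instead of failing silently
//     }
//   }
```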
Let the audio (or maybe music 🎶) play!
Now let’s look at the callback functions that we wired up for the various events inside the main() function. Next, add the following:
function playTrack(evt) {
  console.log(
    "[TRACK STARTED]",
    evt.participant && evt.participant.session_id
  );
  // sanity check to make sure this is an audio track
  if (!(evt.track && evt.track.kind === "audio")) {
    console.error("!!! playTrack called without an audio track !!!", evt);
    return;
  }
  // don't play the local audio track (echo!)
  if (evt.participant.local) {
    return;
  }
  let audioEl = document.createElement("audio");
  document.body.appendChild(audioEl);
  audioEl.srcObject = new MediaStream([evt.track]);
  audioEl.play();
}
If you remember from above, every time we get a track-started event, we call our playTrack function. We start things off with a console.log to help with debugging. Next, we check that the track we’re getting is an audio track and return early if it’s not. After that, we check that the track is not from the local participant, because you don’t want to hear yourself.
Finally, once we’re sure we have an audio track from a remote participant, we create a new audio element using createElement and append it to the body using appendChild. Next, we set its srcObject to a new MediaStream created from the track that was passed to our event handler, and then we tell it to play().
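The two early-return checks described above can also be pulled out into one small predicate, which makes them easy to try without a browser or a live call. This is just a sketch: shouldPlayTrack is a hypothetical helper name, and the event objects in the examples are hand-made stand-ins with only the fields the checks read.

```javascript
// Sketch only: true when the event carries a remote audio track.
// shouldPlayTrack is a hypothetical helper, not part of daily-js.
function shouldPlayTrack(evt) {
  const isAudio = Boolean(evt && evt.track && evt.track.kind === "audio");
  // Guard against a missing participant before reading .local
  const isLocal = Boolean(evt && evt.participant && evt.participant.local);
  return isAudio && !isLocal;
}

// Hand-made stand-ins for track-started events:
const remoteAudio = { track: { kind: "audio" }, participant: { local: false } };
const localAudio = { track: { kind: "audio" }, participant: { local: true } };
const remoteVideo = { track: { kind: "video" }, participant: { local: false } };

console.log(shouldPlayTrack(remoteAudio)); // true
console.log(shouldPlayTrack(localAudio)); // false
console.log(shouldPlayTrack(remoteVideo)); // false
```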
Stay up to date or be destroyed
As you may have noticed, the HTML contains an empty div with the id participants. You may also have noticed event handlers for participant-joined and participant-left. Let’s look at the shared callback for both of those. Add the following next:
function updateParticipants(evt) {
  console.log("[PARTICIPANT(S) UPDATED]", evt);
  let el = document.getElementById("participants");
  let count = Object.entries(call.participants()).length;
  el.innerHTML = `Participant count: ${count}`;
}
This function gets called every time a participant joins or leaves the call. It updates the contents of the #participants div with a count, calculated by calling participants(), converting the result to an array using Object.entries, and taking the length of that array.
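If you would rather show how many other people are in the room, the same object can be split into local and remote entries. Here is a minimal sketch, assuming each entry returned by participants() carries a boolean local flag (the same flag playTrack reads from evt.participant). summarizeParticipants is a hypothetical helper name, and the sample object is a hand-made stand-in.

```javascript
// Sketch only: count everyone, and everyone except the local participant.
// summarizeParticipants is a hypothetical helper, not part of daily-js.
function summarizeParticipants(participants) {
  const all = Object.values(participants);
  const remote = all.filter((p) => !p.local);
  return { total: all.length, remote: remote.length };
}

// Hand-made stand-in for the object returned by call.participants():
const sampleParticipants = {
  local: { local: true },
  "session-a": { local: false },
  "session-b": { local: false },
};

console.log(summarizeParticipants(sampleParticipants)); // { total: 3, remote: 2 }
```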
The last callback function is destroyTrack, which does some cleanup. Add the following:
function destroyTrack(evt) {
  console.log(
    "[TRACK STOPPED]",
    (evt.participant && evt.participant.session_id) || "[left meeting]"
  );
  let els = Array.from(document.getElementsByTagName("video")).concat(
    Array.from(document.getElementsByTagName("audio"))
  );
  for (let el of els) {
    if (el.srcObject && el.srcObject.getTracks()[0] === evt.track) {
      el.remove();
    }
  }
}
This is called when we receive a track-stopped event. First we create an array of video and audio elements. Then we loop through it and remove() any element whose track matches the one in the event.
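Scanning every media element works fine for a small demo. An alternative, sketched here as our own variation rather than what the sample app does, is to remember which element belongs to which track when it is created, keyed by the track's id. createTrackRegistry is a hypothetical helper, and the objects in the example are stand-ins for a MediaStreamTrack and an audio element.

```javascript
// Sketch only: remember which element plays which track, keyed by track.id.
// createTrackRegistry is a hypothetical helper, not part of daily-js.
function createTrackRegistry() {
  const elementsByTrackId = new Map();
  return {
    // Call this where the audio element is created (e.g. in playTrack)
    add(track, el) {
      elementsByTrackId.set(track.id, el);
    },
    // Call this on track-stopped; returns true if an element was removed
    remove(track) {
      const el = elementsByTrackId.get(track.id);
      if (!el) return false;
      el.remove();
      elementsByTrackId.delete(track.id);
      return true;
    },
    size() {
      return elementsByTrackId.size;
    },
  };
}

// Stand-ins for a MediaStreamTrack and an <audio> element:
const fakeTrack = { id: "track-1" };
const fakeEl = { removed: false, remove() { this.removed = true; } };

const registry = createTrackRegistry();
registry.add(fakeTrack, fakeEl);
console.log(registry.remove(fakeTrack), fakeEl.removed); // true true
```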
Summary and next steps
And that’s it! If all went well, you now have the makings of your own exclusive, bespoke, audio chat.
You can check out a working example on Glitch, and also play around with the code in their editor.
Of course, what we’ve demonstrated here is just a starting point. Some suggestions for ways to improve this are:
- Better error handling
- A more fully featured UI (ours is quite spartan)
- More user roles (host, moderator, listener, etc.)
We will definitely be exploring this audio-only use case in more depth in the coming weeks, so stay tuned for more exciting content and reach out if you have any questions.