Daily makes native mobile SDKs for iOS, Android, and other mobile platforms, all built on a common library written in Rust. This approach gives teams access to the same features on all platforms, with a familiar API. You can learn more in our mobile SDK resources post.
We’re now introducing another resource for our iOS SDK: Daily’s iOS Starter Kit. In this post, I’ll give you an overview of the starter kit. In a follow-up post, I’ll delve into additional technical details about its implementation.
What is Daily’s iOS Starter Kit?
Daily’s iOS Starter Kit is an example of how to build a video conferencing app with our Client SDK for iOS and SwiftUI. We will cover topics at a high level, so some experience with Swift and iOS development will be helpful. We have not used any third-party libraries, so you can more easily see how to use our iOS SDK with frameworks you are already familiar with. This starter kit could be used as a starting point for a new app, but a production app would require additional performance optimization and robust error handling.
Getting started
We will go through some examples of different components of the starter kit. If you want to follow along as we go, prepare the following:
- Clone the repository.
- Sign up for a Daily account.
- Create a Daily room, and make note of the URL.
To start with, let’s go through a few key concepts.
Key concepts
SwiftUI
SwiftUI is a declarative user interface (UI) framework that Apple introduced at WWDC 2019. In contrast to an imperative framework, such as UIKit, SwiftUI apps describe the user interface as a large tree of state. The framework updates the UI whenever that state changes. This approach ensures the UI is always in sync with app state and can greatly reduce the amount of code needed to support other Apple platforms.
Daily’s iOS SDK CallClient
The Daily call client provides everything you need to build a video call app. It is used to join and leave meetings, control the local camera and microphone, and enable many other operations that you can read about in our iOS SDK documentation. At its core, the CallClient class has two primary responsibilities:
- Provide state about the call and its participants
- Handle user actions from the UI
Daily call state
The call client exposes a large variety of state that can be observed for changes. We will primarily use values from the CallState enum and Participant struct. The CallState value will change when the local participant joins or leaves a call. We use it to know which screens to show before and during a video call.
Each person in the call is represented by a Participant instance. This enables access to the participant-specific data, including their video tracks, name, and mute state of their devices.
Daily call client actions
Actions are things we can ask the call client to do on behalf of the user of our app. The starter kit app’s UI contains buttons to perform the following actions:
- Joining or leaving a call
- Toggling the microphone and camera
- Switching between the front and back cameras
- Choosing a wireless speaker
- Copying a link to the meeting
- Viewing details about the other participants in the call
When an app user taps one of these buttons, we call a corresponding method on an instance of the CallClient class, such as join() or setInputEnabled(). After one of these methods is called, the state within the call client will be updated, and we will then update the UI.
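The round trip described above (button tap, client method, state update, UI refresh) can be sketched without the SDK. Everything in this sketch is a stand-in: SketchCallClient, setMicrophoneEnabled(_:), and onStateChange are hypothetical names invented for illustration and are not part of the Daily API.

```swift
// Hypothetical stand-in for a call client; this is NOT the Daily `CallClient` API.
final class SketchCallClient {
    struct State: Equatable {
        var isMicrophoneEnabled = true
    }

    // Called after every real state change so the UI layer can refresh itself.
    var onStateChange: ((State) -> Void)?

    private(set) var state = State() {
        didSet {
            if state != oldValue { onStateChange?(state) }
        }
    }

    // An "action": the UI asks the client to do something on the user's behalf.
    func setMicrophoneEnabled(_ isEnabled: Bool) {
        state.isMicrophoneEnabled = isEnabled
    }
}

// The "UI" here is just a button title derived from the client's state.
var microphoneButtonTitle = "Mute"
let client = SketchCallClient()
client.onStateChange = { state in
    microphoneButtonTitle = state.isMicrophoneEnabled ? "Mute" : "Unmute"
}

// Simulate the user tapping the microphone button.
client.setMicrophoneEnabled(false)
print(microphoneButtonTitle) // prints "Unmute"
```

The UI never changes its own title directly. It only reacts to state published by the client, which is the same direction of data flow the starter kit relies on.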
Now that we’ve covered the key concepts, let’s go through a high-level overview of the starter kit’s core components.
Core components
CallManager
The CallManager class is the source of truth for UI state. It contains the core business logic of the app. To support testing and the preview canvas, we have an accompanying FakeCallManager. Both the CallManager and FakeCallManager classes conform to the CallManageable protocol:
protocol CallManageable {
    // The current value of the Daily `CallState` enum.
    var callState: CallState { get }

    // A struct we have created to hold a simplified version of the
    // participant state published by the Daily call client.
    var participants: CallParticipants { get }

    // More properties ...

    // Joins a call with the specified url.
    func join(url: URL)

    // Leaves the current call.
    func leave()

    // More methods ...
}
Join screen
First, our app needs a way for the local participant to join a call. Let’s walk through our implementation of the join screen below:
We will only show portrait layout code in this section to keep the examples simpler, but you can see the corresponding landscape layout code in the repo.
When the Join button above is tapped, a joinButtonTapped() method is called on the view’s model property, which then calls the join() method defined by the CallManageable protocol.
Note that the local participant’s video is shown in full at its native aspect ratio, without any cropping. This ensures anyone using the app knows what the other participants can see, which is a crucial privacy consideration.
In the following implementation for the join screen, we created buttonView and similarly named properties for views that have different positions in the portrait and landscape layouts.
struct JoinLayoutView: View {
    // ...
    @EnvironmentObject private var model: JoinLayoutModel
    // ...
    var body: some View {
        ZStack {
            Colors.backgroundPrimary
                .ignoresSafeArea()
            VStack(spacing: 16) {
                titleView
                ZStack {
                    participantView
                    VStack {
                        inputView
                        Spacer()
                    }
                    .padding(4)
                }
                .aspectRatio(
                    layout.localVideoAspectRatio,
                    contentMode: .fit
                )
                buttonView
            }
            .padding(
                EdgeInsets(top: 16, leading: 32, bottom: 16, trailing: 32)
            )
            .ignoresSafeArea(.keyboard)
        }
    }
    // ...
    private var buttonView: some View {
        Button("Join meeting") {
            model.joinButtonTapped()
        }
        .frame(maxWidth: .infinity)
        .buttonStyle(PrimaryButtonStyle())
        .disabled(model.isJoinButtonDisabled)
    }
    // ...
}
In addition to creating our view, we have also set up a preview provider using an instance of the FakeCallManager class, which lets us iterate on the UI without needing to build and run in the simulator.
struct JoinView_Previews: PreviewProvider {
    static var previews: some View {
        ContextView(callManager: FakeCallManager()) {
            JoinLayoutView()
        }
    }
}
Using fakes or other test doubles in previews has many benefits, such as:
- No unnecessary network calls will be made when the preview is rendered.
- Previews can be used without an Internet connection.
- Method calls are very fast.
- Creating state for edge cases is easy.
- Multiple previews can be created for different device types and orientations without those previews conflicting with each other.
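To make the edge-case benefit concrete, here is a minimal sketch of a fake whose starting state is injected. It uses a reduced, hypothetical protocol and stand-in types (SketchCallState, SketchCallManageable, SketchFakeCallManager) so that it runs on its own. The FakeCallManager in the repo may be implemented differently.

```swift
import Foundation

// Simplified stand-in for the Daily `CallState` enum, using the case names
// mentioned in this post.
enum SketchCallState {
    case initialized, joining, joined, leaving, left
}

// A reduced, hypothetical version of the `CallManageable` protocol.
protocol SketchCallManageable {
    var callState: SketchCallState { get }
    func join(url: URL)
    func leave()
}

// A fake that never touches the network. Its starting state is injected, so
// a preview or test can begin in any state, including short-lived ones.
final class SketchFakeCallManager: SketchCallManageable {
    private(set) var callState: SketchCallState
    private(set) var joinedURLs: [URL] = []

    init(callState: SketchCallState = .initialized) {
        self.callState = callState
    }

    func join(url: URL) {
        joinedURLs.append(url) // Recorded so a test can assert on it.
        callState = .joined    // Completes instantly: no connection needed.
    }

    func leave() {
        callState = .left
    }
}

// Start directly in the `joining` state, which is hard to hold on screen
// with a real connection.
let joiningManager = SketchFakeCallManager(callState: .joining)
```

A preview built on a fake like this could render the join screen with its button disabled, without waiting for a real call to reach that state.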
With the view and preview set up, everything can be wired up in the model. In the model below, we determine the value for isJoinButtonDisabled based on the current CallState value, which can be initialized, joining, joined, leaving, or left. After the local participant taps the Join button, it stays disabled while the call is joining. It remains disabled once the call connects and transitions to joined, at which point the join screen is replaced by the in-call views. If an error occurs and the call transitions to left, the button is enabled again.
class JoinLayoutModel: ObservableObject {
    init(manager: CallManageable) {
        // ...
        // Disable the join button in the `joining` and `joined` states.
        manager.publisher(for: .callState)
            .map { [.joining, .joined].contains($0) }
            .assign(to: &$isJoinButtonDisabled)
        // ...
    }

    @Published private(set) var isJoinButtonDisabled: Bool = false

    func joinButtonTapped() {
        // ...
        // Ask the `CallClient` to join the meeting.
        manager.join(url: meetingURL)
        // ...
    }
}
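The rule inside the map step above can be checked in isolation. The sketch below restates it as a plain function over a stand-in enum (SketchCallState is a hypothetical name that mirrors the CallState cases listed earlier) and prints the result for every state.

```swift
// Stand-in for the Daily `CallState` enum, using the case names listed above.
enum SketchCallState: CaseIterable {
    case initialized, joining, joined, leaving, left
}

// The same rule the model applies in its `map` step, as a plain function.
func isJoinButtonDisabled(for state: SketchCallState) -> Bool {
    let disabledStates: Set<SketchCallState> = [.joining, .joined]
    return disabledStates.contains(state)
}

// Print the rule for every state.
for state in SketchCallState.allCases {
    let label = isJoinButtonDisabled(for: state) ? "disabled" : "enabled"
    print("\(state): \(label)")
}
// initialized: enabled
// joining: disabled
// joined: disabled
// leaving: enabled
// left: enabled
```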
Once the participant is in the call, the entire view we just created will be replaced with a different layout and the call controls. Let’s take a closer look at the in-call views now.
In-call views
After the local participant has joined the call, they will be presented with different layouts managed by an instance of InCallView, as seen below.
Both the view above and JoinLayoutView, which we discussed in the previous section, are managed by an instance of CallContainerView. CallContainerView transitions between the two when the CallState value changes. InCallView itself is responsible for transitioning between different layout views based on the number of participants in the call. It also shows and hides CallControlsOverlayView, which contains the buttons at the top and bottom of the screen.
While most interactions in our app indirectly invoke a method on a CallClient
instance, UI-specific behavior can be handled directly in models themselves. The background tap that shows and hides the call controls simply updates the opacity of the call controls based on the current opacity value.
class InCallModel: ObservableObject {
    // ...
    // Visibility of the call controls based on user interaction.
    @Published private(set) var callControlsOpacity: CGFloat = 1

    func backgroundTapped() {
        // Toggle the call controls whenever the background is tapped.
        callControlsOpacity = callControlsOpacity.isZero ? 1 : 0
    }
    // ...
}
We’ve now covered the general structure of our view hierarchy, how to join a call, and how to show and hide the call controls. Next, let’s look at displaying remote participants in a grid view.
Grid screen
The primary screen in our app is a grid layout, in which we always show the local participant. Remote participants are shown in a dynamically resizing grid:
The grid starts as a single tile when the first remote participant joins the call. It grows up to 2x3 tiles on iPhone and 3x4 tiles on iPad as each subsequent participant joins. If there are more participants in the call than the maximum number of tiles, we replace the least recently visible participant with the active speaker. This ensures that the most active participants are visible. If a participant starts sharing their screen, the track with their camera video will be replaced by their screen share content.
Grid dimensions are determined by a GridGeometry struct with properties for the row and column counts, as seen below:
struct GridGeometry: Equatable {
    let rowCount: Int
    let columnCount: Int
}
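This post does not spell out how the row and column counts are chosen as participants join, so the following is only one plausible sketch. It assumes that "2x3" means 2 columns by 3 rows, and it uses a hypothetical helper, sketchGridGeometry(tileCount:maxColumns:maxRows:), that picks a near-square grid within those limits. The starter kit may grow its grid differently.

```swift
struct GridGeometry: Equatable {
    let rowCount: Int
    let columnCount: Int
}

// Hypothetical helper: picks a near-square grid that fits `tileCount` tiles
// without exceeding the column and row limits. Extra tiles beyond the
// maximum are ignored, since they would not be visible.
func sketchGridGeometry(tileCount: Int, maxColumns: Int, maxRows: Int) -> GridGeometry {
    guard tileCount > 0, maxColumns > 0, maxRows > 0 else {
        return GridGeometry(rowCount: 0, columnCount: 0)
    }
    let count = min(tileCount, maxColumns * maxRows)
    // Columns needed for a roughly square grid.
    let nearSquare = Int(Double(count).squareRoot().rounded(.up))
    // Columns needed so that the row limit is never exceeded.
    let neededForRowLimit = (count + maxRows - 1) / maxRows
    let columnCount = max(min(maxColumns, nearSquare), neededForRowLimit)
    // Round up so that a partially filled last row still counts.
    let rowCount = (count + columnCount - 1) / columnCount
    return GridGeometry(rowCount: rowCount, columnCount: columnCount)
}

// Assumed iPhone limits: 2 columns by 3 rows.
for tiles in 1...7 {
    let grid = sketchGridGeometry(tileCount: tiles, maxColumns: 2, maxRows: 3)
    print("\(tiles) tiles -> \(grid.columnCount) columns x \(grid.rowCount) rows")
}
```

With these assumed limits, the grid stops growing at 6 tiles, which is the point where the active-speaker replacement described above would take over.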
The state for the grid layout is provided by the following CallParticipants struct:
struct CallParticipants: Equatable {
    var count: Int { 1 + remote.count }

    let local: CallParticipant
    let remote: [CallParticipant.ID: CallParticipant]
    let visible: [Int: CallParticipant]

    func stabilize(_ other: CallParticipants) -> CallParticipants {
        // ...
    }
}
The most important things to understand about this type are:
- The visible dictionary contains only the currently visible participants, which ensures the view will not need to be re-rendered when non-visible participants change.
- The keys of the visible dictionary are integers that represent the related participant’s position within the grid. Using a dictionary will let us leave some grid positions empty, which can be useful for more advanced animations.
- The stabilize() method returns a copy of its other argument that preserves the existing grid position of any participants contained in the receiver. This ensures that any participants that are visible do not move to different grid positions when they speak, which can be both distracting and an accessibility issue for motion-sensitive folks.
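The body of stabilize() is elided above, so the following is a sketch of the idea rather than the starter kit’s implementation. It works on plain dictionaries that map grid positions to participant IDs, and sketchStabilize(_:against:) is a hypothetical free function. It assumes each participant appears at most once in a snapshot.

```swift
// Returns a copy of `next` in which every participant that was already
// visible in `previous` keeps its old grid position. Newcomers fill the
// positions that are left over. Keys are grid positions, values are IDs.
func sketchStabilize(_ next: [Int: String], against previous: [Int: String]) -> [Int: String] {
    let nextIDs = Set(next.values)
    let positions = Set(next.keys)
    var result: [Int: String] = [:]
    var placed = Set<String>()

    // 1. Keep participants that are visible in both snapshots where they were.
    for position in previous.keys.sorted() {
        guard let id = previous[position],
              nextIDs.contains(id),
              positions.contains(position),
              !placed.contains(id) else { continue }
        result[position] = id
        placed.insert(id)
    }

    // 2. Put everyone else into the remaining positions, lowest first.
    var free = positions.subtracting(result.keys).sorted()
    for position in next.keys.sorted() {
        guard let id = next[position], !placed.contains(id), !free.isEmpty else { continue }
        result[free.removeFirst()] = id
        placed.insert(id)
    }
    return result
}

// "dee" starts speaking and replaces "ben". The incoming snapshot has also
// shuffled "ana" and "cy", which would make their tiles jump.
let previous = [0: "ana", 1: "ben", 2: "cy"]
let next = [0: "dee", 1: "cy", 2: "ana"]
let stabilized = sketchStabilize(next, against: previous)
// "ana" stays at 0, "cy" stays at 2, and "dee" takes the freed position 1.
```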
That completes our overview of the grid screen. Grab some of your friends, and try it out for yourself.
Conclusion
In this post, we learned how to build a video call app with the Daily iOS SDK and SwiftUI. We will add more features to the starter kit over time, so you may want to watch the repo to get notified of future updates. If you have any questions about the starter kit or Daily’s mobile SDKs, please feel free to contact our support team or join our Discord community.