At Tablelist we take our iOS products very seriously. Our consumer application drives a large portion of our revenue, and our new native app for NightPro is being used by venue partners across the United States to operate their nightclubs and lounges. Both Tablelist and NightPro talk to the same API, and even share some of the same endpoints. They also have very similar needs: the ability to log in, securely store an authentication token, list venues, read reservations, and so on. We knew early on that we would need to share code between both apps, so naturally we set out to build our own SDK that could abstract away talking to the central API and all the data access layers. A less obvious need, and perhaps the most important, was the ability to operate in bad network conditions and even offline. This led to a very strategic decision about how the SDK was designed and which third parties we would leverage to make it happen.

It Starts with Realm

Our most important dependency in the SDK is Realm. Realm is a lightning-fast database designed specifically for mobile clients. It's an object database through and through, giving us the ability to work with real Swift classes from the start. It supports all the things you would expect, like relationships between objects, subclassing, and a killer threading model. I'll save all the details of Realm for their development docs (they're excellent), but the important takeaway is that in order to have an offline application you need a way to save application state and efficiently restore it. Realm is our answer.

A Consistent API

Just because we are designing an SDK for offline use does not mean it shouldn't work well when it's online. By "work well" I'm not talking about performance: the API that a developer uses when building an app needs to be consistent in the online and offline cases. The most obvious thing to avoid is relying on a closure after a network request to display some data. For example:

override func viewDidLoad() {  
    super.viewDidLoad()

    EventService().list(in: city) { events in
        self.events = events
        self.tableView.reloadData()
    } 
}

func tableView(_ tableView: UITableView, numberOfRowsInSection section: Int) -> Int {  
    return events.count
}

This code is directly relying on the network call to list events in order to show data. What we need to do is switch our model to only care about the events that we already have in our database, and the network call can simply populate the database. Something like this:

let resultsManager = ResultsManager<Event>()

override func viewDidLoad() {  
    super.viewDidLoad()

    resultsManager
        .filter("city", isEqualTo: city)
        .observe { _ in self.tableView.reloadData() }

    EventService().list(in: city)
}

func tableView(_ tableView: UITableView, numberOfRowsInSection section: Int) -> Int {  
    return resultsManager.results.count
}

Whoa, okay, we have a few things to digest here. First of all, that ResultsManager class is coming from our SDK, and it's the most used class in all of our apps. It's a wrapper around Realm Results that makes it really easy to ask for a collection of data. It's a generic class, and in this case we want it to query for Events. But notice how it is initialized where it is declared. This is important: it means we have access to events already in the database as soon as this controller loads. In the first example we didn't have access to any events until after the network request finished. Even in an online scenario there is a huge performance gain for returning users, because they will see content immediately.

So what happens when the user first runs the app? How do the events show up? Well, that's where the observe call comes into play. Realm has some serious magic behind the scenes where the database can tell you when things have changed. Even more magical, that observe call will only be run when events in the given city have changed. This is why we no longer need a closure attached to the EventService.list call: the EventService has the know-how to put those objects in the Realm database, and Realm will automatically let us know when we need to update our UI.
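
To make the idea concrete without pulling in Realm, here is a minimal in-memory sketch of the same pattern. Everything in it (EventStore, its observe and upsert methods, the sample events) is hypothetical and written for this illustration. It is not Realm's API or our SDK's ResultsManager, just the shape of "the network fills the store, the UI watches the store":

```swift
// Hypothetical in-memory stand-in for a database with change notifications.
// None of these types come from Realm or the Tablelist SDK.
struct Event: Equatable {
    let id: Int
    let city: String
    let name: String
}

final class EventStore {
    private var storage: [Int: Event] = [:]
    private var observers: [(city: String, callback: ([Event]) -> Void)] = []

    // Events for a city, available synchronously from local storage.
    func events(in city: String) -> [Event] {
        return storage.values
            .filter { $0.city == city }
            .sorted { $0.id < $1.id }
    }

    // Register a callback that fires only when events in `city` change.
    func observe(city: String, _ callback: @escaping ([Event]) -> Void) {
        observers.append((city: city, callback: callback))
    }

    // Called by the networking layer when a response arrives.
    func upsert(_ incoming: [Event]) {
        var touchedCities = Set<String>()
        for event in incoming where storage[event.id] != event {
            if let old = storage[event.id] { touchedCities.insert(old.city) }
            storage[event.id] = event
            touchedCities.insert(event.city)
        }
        for observer in observers where touchedCities.contains(observer.city) {
            observer.callback(events(in: observer.city))
        }
    }
}

let store = EventStore()
var bostonReloads = 0
store.observe(city: "Boston") { _ in bostonReloads += 1 }

// Simulated network responses landing in the store.
store.upsert([Event(id: 1, city: "Boston", name: "Friday Night")])
store.upsert([Event(id: 2, city: "Miami", name: "Pool Party")])
store.upsert([Event(id: 1, city: "Boston", name: "Friday Night")]) // unchanged

print(bostonReloads) // 1
```

The Boston observer fires once: the Miami event belongs to another city, and the repeated Boston event changes nothing. The view controller never needs to know whether the data came from disk or from a request that just finished.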

Our SDK enforces this mindset throughout all of its modules, to the point that we don't even let a caller pass in a closure like we did in the first example. Instead we return Promises from all of our service-level methods, but that's a blog post for another day.

Going Completely Offline

If you stopped here, you would have a pretty robust offline experience for browsing around and viewing content. But NightPro needs to go a step further than our consumer application: it needs to save and update data completely offline. Operating a cloud-based application inside a nightclub is no easy feat: cell service is usually non-existent, and let's just say they don't all have mesh wifi networks throughout the club.

First, let's look at what a typical save call would look like in our app:

@IBAction func handleSaveButtonTapped(_: Any?) {
    saveReservation(self.reservation)
}

func saveReservation(_ reservation: Reservation) {  
    ReservationService().save(reservation)
        .then { self.handleReservationSaved($0) }
        .catch { self.alertError($0) }
}

Now we want to convert this code to properly handle an offline state. You might be thinking that we should check whether we're offline before we send the request to save the reservation, and you'd be wrong! That is one of the biggest misconceptions about handling offline requests: you should always send the network request, even if you are sure you are offline. The right thing to do is to try to recover from the failed request, and handle the offline case if appropriate:

@IBAction func handleSaveButtonTapped(_: Any?) {
    saveReservation(self.reservation)
}

func saveReservation(_ reservation: Reservation) {  
    ReservationService().save(reservation)
        .catchThen { error in
            guard OfflineService.shouldHandleErrorOffline(error) else { throw error }
            return OfflineService().handleSaveOffline(for: reservation)
        }
        .then { self.handleReservationSaved($0) }
        .catch { self.alertError($0) }
}

There are three important pieces to this change. First, we introduced a handler in between the original save call and our success handler. This allows the OfflineService to try to recover from the failed network request. If it is able to recover, then the original success handler will be called as if everything worked correctly. This is a huge advantage in that we don't need to change any other application code to handle the offline case: our code gets to behave exactly as if it were online.

Inside the handler, the first thing we check is whether we should even handle the error as an offline error. In our SDK we consider an offline error to be either an incomplete request (invalid or no response) or a 503 error from our API. These two cases are equally important: being offline doesn't just mean the device has no network connection, it could also mean that we have a good internet connection and our API is down. By handling both cases we can ensure our venue partners have smooth operations in both worst-case scenarios.
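
As a rough sketch of that check, here is what the classification and the recover-or-rethrow step could look like in plain synchronous Swift. The NetworkError type and the saveWithOfflineFallback helper are assumptions made up for this example. Our real SDK does this with Promises and its own error model, neither of which is shown in this post:

```swift
// Hypothetical error type; the real SDK's error model is not shown here.
enum NetworkError: Error {
    case noResponse             // the request never completed
    case invalidResponse        // a response arrived but was unusable
    case http(statusCode: Int)  // the API answered with an error status
}

// True when the failure should be treated as "offline": either the
// request never completed properly, or the API reported 503.
func shouldHandleErrorOffline(_ error: Error) -> Bool {
    guard let networkError = error as? NetworkError else { return false }
    switch networkError {
    case .noResponse, .invalidResponse:
        return true
    case .http(let statusCode):
        return statusCode == 503
    }
}

// Always try the remote call first. Fall back to the offline path only
// for offline-style failures; every other error propagates to the caller.
func saveWithOfflineFallback<T>(
    remote: () throws -> T,
    offline: () throws -> T
) throws -> T {
    do {
        return try remote()
    } catch let error where shouldHandleErrorOffline(error) {
        return try offline()
    }
}

let apiDown = try? saveWithOfflineFallback(
    remote: { () throws -> String in throw NetworkError.http(statusCode: 503) },
    offline: { "queued for replay" }
)
let badRequest = try? saveWithOfflineFallback(
    remote: { () throws -> String in throw NetworkError.http(statusCode: 400) },
    offline: { "queued for replay" }
)

print(apiDown ?? "error")    // queued for replay
print(badRequest ?? "error") // error
```

A 400 still surfaces as an error, which is what you want: a request the server rejected is not going to succeed on replay just because connectivity came back.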

Finally, we need to save the request that failed so we can replay it when we regain connectivity. On top of that, we also need to create or update the local object in Realm so we can continue using it while offline. This is all handled by the OfflineService, which has three top-level functions for creating, updating, and deleting resources. As you might imagine, those operations map directly to HTTP verbs that we will send to our API. All three of those functions do the same basic thing: they create and save an Offline object. A simplified version of our Offline class looks like:

public final class Offline: RealmObject {

    public enum Action: String {
        case create = "CREATE"
        case update = "UPDATE"
        case delete = "DELETE"
    }

    public enum State: String {
        case pending = "PENDING"
        case complete = "COMPLETE"
        case failed = "FAILED"
    }

    public dynamic var createdAt: Date = Date()

    public dynamic var state: String = State.pending.rawValue

    public dynamic var action: String = Action.create.rawValue

    public dynamic var requestUrl: ObjectURL = ""

    public dynamic var requestBody: Data?

    public dynamic var responseBody: Data?
}
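
The mapping from actions to HTTP verbs could be as small as a computed property. This is a sketch under assumptions: the OfflineAction and StoredRequest names, the choice of PUT for updates, and the sample URL are all invented for illustration, and your API may well prefer PATCH:

```swift
// Mirrors the Action enum above; the HTTP mapping itself is an assumption.
enum OfflineAction: String {
    case create = "CREATE"
    case update = "UPDATE"
    case delete = "DELETE"

    var httpMethod: String {
        switch self {
        case .create: return "POST"
        case .update: return "PUT" // could be PATCH, depending on the API
        case .delete: return "DELETE"
        }
    }
}

// Hypothetical plain-Swift stand-in for a persisted Offline object,
// which stores the action as its raw string value.
struct StoredRequest {
    var action: String
    var requestUrl: String
}

// Convert the stored string back into a typed action at replay time.
// Returns nil if the stored value is not a known action.
func httpMethod(for request: StoredRequest) -> String? {
    return OfflineAction(rawValue: request.action)?.httpMethod
}

let stored = StoredRequest(action: "UPDATE", requestUrl: "/reservations/123")
print(httpMethod(for: stored) ?? "unknown action") // PUT
```

Keeping the raw string in storage and converting to the enum at the edges means an unexpected value turns into a nil you can handle, rather than a crash during replay.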

Storing the offline requests in Realm gives us some big advantages. Firstly, the user is safe to quit (or even crash) the app without losing any of their offline changes. We can also easily sort the offline requests by their createdAt property for easy replay in the order they were added. Finally, we get to query the offline objects with all the same robustness that we have for our other objects in Realm. This lets us set up a simple offline queue:

let offlineQueue = ResultsManager<Offline>()  
    .filter("state", isEqualTo: Offline.State.pending)
    .sorted(byKeyPath: "createdAt", ascending: true)

And recursively process the queue with little work:

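func processQueue() {
    // Stop once no pending requests remain.
    guard let offline = offlineQueue.results.first else { return }

    process(offline)
        .then {
            // Replay succeeded: mark complete and keep the response.
            self.setOfflineState(.complete, for: offline, response: $0)
        }
        .catch {
            // Replay failed: mark failed so it leaves the pending queue.
            self.setOfflineState(.failed, for: offline, response: $0)
        }
        .finally {
            // Either way, move on to the next pending request.
            self.processQueue()
        }
}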

The reason we can always grab the first offline object from the queue is that Realm results are live-updating on the thread they were created on. As soon as we set the state to a non-pending state, the results no longer include that object.
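
If the "always take the first one" trick feels too good to be true, this small in-memory model shows why it terminates. It is a sketch only: OfflineQueue, QueuedRequest, the integer timestamps, and the URLs are hypothetical, the loop is synchronous rather than Promise-based, and the live query is imitated by recomputing a filtered, sorted array on every access:

```swift
// Hypothetical in-memory model of the offline queue; no Realm involved.
enum RequestState { case pending, complete, failed }

final class QueuedRequest {
    let createdAt: Int // stand-in for a timestamp
    let url: String
    var state: RequestState = .pending

    init(createdAt: Int, url: String) {
        self.createdAt = createdAt
        self.url = url
    }
}

final class OfflineQueue {
    private var requests: [QueuedRequest] = []

    func enqueue(_ request: QueuedRequest) {
        requests.append(request)
    }

    // Recomputed on every access, imitating a live-updating query:
    // only pending requests, oldest first.
    var pending: [QueuedRequest] {
        return requests
            .filter { $0.state == .pending }
            .sorted { $0.createdAt < $1.createdAt }
    }

    // Replays pending requests in creation order. `send` returns true on
    // success. Each pass moves one request out of `pending`, so the loop
    // always ends, even when every request fails.
    func processAll(send: (QueuedRequest) -> Bool) -> [String] {
        var replayed: [String] = []
        while let next = pending.first {
            next.state = send(next) ? .complete : .failed
            replayed.append(next.url)
        }
        return replayed
    }
}

let queue = OfflineQueue()
queue.enqueue(QueuedRequest(createdAt: 3, url: "/reservations/c"))
queue.enqueue(QueuedRequest(createdAt: 1, url: "/reservations/a"))
queue.enqueue(QueuedRequest(createdAt: 2, url: "/reservations/b"))

// Pretend the second request is rejected by the server.
let order = queue.processAll { $0.url != "/reservations/b" }
print(order) // ["/reservations/a", "/reservations/b", "/reservations/c"]
```

Note that a failed request is marked failed rather than left pending. If it stayed pending, it would be the first result forever and the queue would never advance.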

More to Think About

Every example I've shared is greatly simplified for the sake of this post, so I wouldn't go copying any code as is. There are some core things I left out of this post that need to be solved to have a truly robust offline experience:

  1. Handling conflicts when two offline users edit the same resource
  2. Creating an object, getting a real ID back from the server, and then using that ID for subsequent changes
  3. Sending proper headers so the backend knows to treat the offline request accordingly
  4. Properly alerting the user when an offline request fails so the user can take proper action

Those are just some of the things we've had to handle. There are many more, and you will need to work out which matter most for your use case. The most important thing we've learned while building offline support for NightPro is that there is no perfect way to handle offline for everyone. It's a tricky problem to get right, and we've heavily optimized the experience based on how we know users use our product. For example, conflict resolution for us is much simpler than it might be for a more collaborative application.