building a peer-to-peer piano platform: pt 1

the orchestra problem

In this series, we’ll walk through the creation of a peer-to-peer piano platform named p2piano. The goal is a website where anyone, regardless of skill level, can create and join rooms to play piano together in real time. I’ve set a few requirements for the project:

  1. Free to use without ads
  2. Input support for touch screens, keyboards, and MIDI devices
  3. Audio synchronization across peers
  4. High quality piano audio

The most challenging aspect of this endeavor will be keeping our musical timing synchronized across collaborators despite the delays of the internet. Fortunately for us, we can take some inspiration from symphony orchestras, which work hard to keep 80 to 100 or more musicians in sync. Obviously they don’t have to deal with network delay, but they do face a similar problem: the delay inherent to large physical spaces.

It takes time for sound to travel from one area of the stage to another. The farther apart two musicians are, the more pronounced the sound delay between them. In practice, this can exceed 40 milliseconds, which nearly any audience would notice. To keep a clear sense of time, the musicians can’t rely on reacting to what they hear around them in the moment. Instead, they anticipate and play earlier than they naturally would. How early depends on how far they are from the conductor: the farther away, the earlier that musician must anticipate and play. The conductor’s gestures and baton provide a shared reference point for the orchestra, which makes this whole process possible.
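To put a rough number on that delay, here is a back-of-the-envelope sketch. It assumes sound travels at about 343 m/s (dry air at roughly 20 °C). The function name and the sample distances are made up for illustration and are not measurements from a real stage.

```typescript
// Approximate speed of sound in dry air at about 20 °C (an assumed constant).
const SPEED_OF_SOUND_M_PER_S = 343;

// Time in milliseconds for sound to cover the given distance.
function acousticDelayMs(distanceMeters: number): number {
  return (distanceMeters / SPEED_OF_SOUND_M_PER_S) * 1000;
}

// Illustrative distances between two players on a stage.
for (const meters of [2, 7, 14]) {
  console.log(`${meters} m apart: about ${acousticDelayMs(meters).toFixed(1)} ms`);
}
```

Under that assumption, two players about 14 meters apart already hear each other roughly 40 milliseconds late.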

To achieve this orchestra-like synchronization over the internet, we first need to think about how to send notes between users as fast as possible. This is essentially our way of squishing the orchestra onto a smaller stage to minimize lag. It’s common for real-time websites to send messages to a server, which then relays them to another user. That server acts as an intermediary, and the intermediary step takes time. In our case, we’ll want to skip that intermediary and send note data directly between users to reduce latency. This is called peer-to-peer (P2P) networking, and it was popularized by file-sharing systems: first Napster, and later BitTorrent. Since we’re building on the web, we’ll use WebRTC for the P2P communication.
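The difference between the two routes can be sketched as simple arithmetic. Everything here is hypothetical: the `PathDelays` shape, the function names, and especially the millisecond values, which are placeholders rather than measurements. Real routes vary, and a direct path is not guaranteed to be faster in every case.

```typescript
// One-way network delays in milliseconds. All values used below are hypothetical.
interface PathDelays {
  senderToServerMs: number;
  serverProcessingMs: number;
  serverToReceiverMs: number;
  senderToReceiverMs: number; // the direct P2P path
}

// Delay when a note is relayed through an intermediary server.
function relayedDelayMs(d: PathDelays): number {
  return d.senderToServerMs + d.serverProcessingMs + d.serverToReceiverMs;
}

// Delay when a note travels straight from one peer to the other.
function directDelayMs(d: PathDelays): number {
  return d.senderToReceiverMs;
}

// Placeholder numbers, chosen only to illustrate the comparison.
const example: PathDelays = {
  senderToServerMs: 30,
  serverProcessingMs: 2,
  serverToReceiverMs: 35,
  senderToReceiverMs: 25,
};

console.log(`relayed: ${relayedDelayMs(example)} ms, direct: ${directDelayMs(example)} ms`);
```

The relayed path pays for two network hops plus whatever time the server spends handling the message, while the direct path pays for one hop.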

system diagram of 5 clients connected in a p2p network

To get this network set up, we need a way to connect users in the same room. That requires a server to coordinate the initial connections, even though the notes themselves will travel peer to peer. The flow is as follows:

  1. User 1 creates a new room and shares the link with 4 friends
  2. The 4 friends join the room by clicking on the link
  3. The server notifies every user in the room when new users join
  4. When a new user joins, the existing users send connection information to the new user through the server
  5. Users establish direct P2P connections using this exchanged information

system diagram of 5 clients connected in a p2p network, with each client also connected to a shared server
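The flow above can be sketched as a small in-memory simulation. This is not WebRTC code and not necessarily how p2piano implements it: `SignalingServer`, `Client`, and the message shape are hypothetical names, and the string payloads stand in for the offers, answers, and ICE candidates that a real WebRTC handshake would exchange.

```typescript
// A message relayed through the server while two peers set up a direct link.
interface SignalMessage {
  from: string;
  to: string;
  payload: string; // placeholder for real connection information
}

// Hypothetical client that records which peers it has exchanged info with.
class Client {
  readonly id: string;
  readonly peers = new Set<string>();
  private server: SignalingServer;

  constructor(id: string, server: SignalingServer) {
    this.id = id;
    this.server = server;
  }

  // Step 3: the server tells us that a new user joined our room.
  onUserJoined(roomId: string, newUserId: string): void {
    // Step 4: existing users send connection info to the newcomer via the server.
    this.server.relay(roomId, { from: this.id, to: newUserId, payload: "connection-info" });
  }

  // Step 5: after the exchange, both sides treat the direct link as established.
  onSignal(roomId: string, msg: SignalMessage): void {
    this.peers.add(msg.from);
    if (msg.payload === "connection-info") {
      this.server.relay(roomId, { from: this.id, to: msg.from, payload: "connection-reply" });
    }
  }
}

// Hypothetical server that only coordinates rooms and relays setup messages.
class SignalingServer {
  private rooms = new Map<string, Map<string, Client>>();

  // Steps 1 and 2: the first join creates the room, later joins add members.
  join(roomId: string, client: Client): void {
    const room = this.rooms.get(roomId) ?? new Map<string, Client>();
    this.rooms.set(roomId, room);
    const existing = Array.from(room.values());
    room.set(client.id, client);
    for (const member of existing) member.onUserJoined(roomId, client.id);
  }

  relay(roomId: string, msg: SignalMessage): void {
    this.rooms.get(roomId)?.get(msg.to)?.onSignal(roomId, msg);
  }
}

const server = new SignalingServer();
const clients = ["user1", "friend1", "friend2", "friend3", "friend4"].map(
  (id) => new Client(id, server),
);
for (const c of clients) server.join("room-1", c);
for (const c of clients) console.log(c.id, "->", Array.from(c.peers).join(", "));
```

Once all five users have joined, each client has a link to the other four, and the server is only involved while those links are being set up.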

Now that we have the networking plan taken care of, we’ll dive into audio synchronization next.