
SignalR Core Part 2/3: ASP.Net Core Sockets

Disclaimer: SignalR Core is still in early stages and changing rapidly. All information in this post is subject to change.

To test some of the scenarios described in the first part of this mini-series I came up with an idea for a relatively simple application where users can report the weather at their location to the server and their report will be broadcast to all the connected clients. I called this application SocialWeather. The central part of the system is an ASP.Net Core application running the SignalR Core server. The server can handle messages in one of the following formats – JSON, Protobuf and pipe (the pipe format is a simple format I created where the values are separated by the pipe character (|) and the message ends with the new line character (\n)).

I also created 3 different clients – a JavaScript client using the JSON format, a C# client using Protobuf and a Lua client using the pipe format. The JavaScript client is part of a web page served by the same application that hosts the server. The C# client is a console application that can send and receive Protobuf messages. The Lua client is the most interesting one as it runs on an ESP8266 development board with the NodeMCU firmware. The whole system uses the "socket" level of the new SignalR, which is a counterpart of persistent connections in the previous version of SignalR (so no hubs API here). All the clients use bare websockets to connect to the server (in other words, there is no SignalR client involved).

The SocialWeather server, which includes the JavaScript client, is a sample project in the SignalR repo and you can run it yourself – just clone the repo and run build.cmd/build.sh to install the correct version of the runtime and restore packages. Then go to the samples/SocialWeather folder and start the server with dotnet run.

The C# and Lua clients are in my personal repo. Running the C# client is straightforward. After you clone the repo you need to restore packages, update the URL to the SignalR server, and run the client with dotnet run. Each time you press Enter the client will generate a random weather report and send it to the server. If you type "!q:" the client will exit.

The Lua client is meant to run on an ESP8266-compatible board with the NodeMCU firmware installed. Preparing the board to run the client requires a bit of work. The first step is to set up serial communication to the board. If the module you have is equipped with a USB port (I have the Lolin v3 board which does have one) it should be enough to install VCP (Virtual COM Port) drivers (you can find the drivers here http://www.ftdichip.com/FTDrivers.htm or here http://www.silabs.com/products/mcu/pages/usbtouartbridgevcpdrivers.aspx). If your board doesn't have a USB port, you will need to use an additional module (e.g. Arduino) for USB-to-serial translation (you can find tutorials about setting this up on the web).

The next step is to make sure that your board has the right firmware. It needs to be a NodeMCU firmware with net, http, wifi, and websocket support. You can request a build here and follow the steps in this tutorial to re-image your board. When the board is ready you should be able to connect to it using a serial terminal (I used screen on Mac and PuTTY on Windows). On Windows, determine the COM port number the board is using with the Device Manager. On Mac it will be one of the /dev/tty* devices. The speed needs to be set to 115200. When using PuTTY make sure the Flow Control setting is set to None or the communication will not work correctly.

[Screenshot: the PuTTY Flow Control setting set to None]

Once connected to the device you need to join it to your wireless network by sending the following commands (replace SSID with the name of the network and PWD with the password, or an empty string for open networks):

wifi.setmode(wifi.STATION)  -- put the chip in station (client) mode
wifi.sta.config(SSID, PWD)  -- set the network name and password
wifi.sta.connect()          -- connect to the access point

We are now ready to run the client. If you haven't already, clone the repo containing the client, go to the src/lua-client folder, and update the URL to the SocialWeather SignalR server. Now you can transfer the file to the device (if you are connected to the device with the terminal you need to disconnect first). The nodemcu-uploader Python script does the job.

If all the stars aligned correctly you should now be able to start the client by executing:

dofile("social-weather.lua")

It will print a confirmation message once it has connected successfully to the server and will then print weather reports sent by other clients. You can also send your own weather report just by typing:

ws:send("72|2|0|98052\n")

and pressing Enter.

The command may look a bit cryptic but it is quite simple. ws is a handle to the websocket instance created by the social-weather.lua script. send is a method of the websocket class, so we literally invoke the send method of the websocket interactively. The argument is a SocialWeather report in the pipe format: a pipe-separated list of values – temperature, weather, time, zip code – terminated with \n.
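
To make the wire format concrete, here is a minimal C# sketch of writing and parsing such a report (the WeatherReport class and the field encodings are illustrative – the actual sample has its own types):

using System;

public class WeatherReport
{
    public int Temperature { get; set; }   // e.g. 72
    public int Weather { get; set; }       // numeric weather code (encoding is illustrative)
    public long ReportTime { get; set; }
    public string ZipCode { get; set; }

    // Serialize to the pipe format: values separated by '|', terminated with '\n'.
    public string ToPipeFormat()
    {
        return Temperature + "|" + Weather + "|" + ReportTime + "|" + ZipCode + "\n";
    }

    // Parse a single pipe-formatted message, e.g. "72|2|0|98052\n".
    public static WeatherReport FromPipeFormat(string message)
    {
        var parts = message.TrimEnd('\n').Split('|');
        return new WeatherReport
        {
            Temperature = int.Parse(parts[0]),
            Weather = int.Parse(parts[1]),
            ReportTime = long.Parse(parts[2]),
            ZipCode = parts[3]
        };
    }
}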

This is what it looks like when you run all three clients:

[Animation: the JavaScript, C# and Lua clients exchanging weather reports]

Let’s take a look at how things work under the hood. On the server side the central part is the SocialEndPoint class which handles the clients and processes their requests. If you look at the loop that processes requests, it does not do any parsing on its own. Instead, it offloads parsing to a formatter and deals only with strongly typed instances. A formatter is a class that knows how to turn a message into an object of a given type and vice versa. In the case of the SocialWeather application the only kind of message sent to and from the server is the weather report, so this is the only type the formatters need to understand.

When sending messages to clients the process is reversed. The server gives the formatter a strongly typed object and leaves it to the formatter to turn it into a valid wire format.
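
A contract along these lines would be enough for SocialWeather (a sketch – the interface and method names are mine, not the actual types from the SignalR repo):

using System.IO;
using System.Threading.Tasks;

// Illustrative contract – not the actual interface from the SignalR repo.
public interface IWeatherFormatter<T>
{
    // Turn the bytes received from a client into a strongly typed object.
    Task<T> ReadAsync(Stream input);

    // Turn a strongly typed object into its wire representation.
    Task WriteAsync(T value, Stream output);
}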

How does the server know which formatter to use for a given connection? All the available formatters need to be registered in the DI container and mapped to the type they can handle and format. When establishing the connection, the client sends the format it understands as a query string parameter. The server stores this value in the connection metadata and uses it later to resolve the correct formatter for the connection.
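
In code, the resolution step could look roughly like this (a sketch – the registry class and the format names are made up for illustration; check the SocialWeather sample for the actual implementation):

using System;
using System.Collections.Generic;

// Illustrative sketch – maps a format name (taken from the query string and
// stored in the connection metadata) to a formatter registered at startup.
public class FormatterRegistry
{
    private readonly Dictionary<string, IWeatherFormatter<WeatherReport>> _formatters =
        new Dictionary<string, IWeatherFormatter<WeatherReport>>();

    public void Register(string format, IWeatherFormatter<WeatherReport> formatter)
    {
        _formatters[format] = formatter;
    }

    public IWeatherFormatter<WeatherReport> Resolve(string formatType)
    {
        IWeatherFormatter<WeatherReport> formatter;
        if (!_formatters.TryGetValue(formatType, out formatter))
        {
            throw new InvalidOperationException("Unknown format: " + formatType);
        }
        return formatter;
    }
}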

On the client side things are even simpler. The Lua client has just 30 lines of code, half of which are concerned with printing weather reports in a human-readable form. Because the format of the connection cannot change once the connection is established, message parsing can be hardcoded. The rest is just setting up the websocket to connect to the server and react to incoming message notifications.

The C# client is equally simple. It contains two asynchronous loops – one for receiving messages (weather reports) from the server and one for sending messages to the server. Again, handling the wire format (which in this case is Protobuf) is hardcoded in the client.
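
Reduced to its essence, the client’s shape is roughly the following (a sketch using ClientWebSocket and, for readability, the pipe format instead of the actual Protobuf handling; the URL is a placeholder):

using System;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        RunAsync().GetAwaiter().GetResult();
    }

    private static async Task RunAsync()
    {
        var ws = new ClientWebSocket();
        // Placeholder URL – point this at your SocialWeather server.
        await ws.ConnectAsync(new Uri("ws://localhost:5000/weather"), CancellationToken.None);

        // Receive loop – print whatever the server broadcasts.
        var receiveLoop = Task.Run(async () =>
        {
            var buffer = new byte[4096];
            while (ws.State == WebSocketState.Open)
            {
                var result = await ws.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
                if (result.MessageType == WebSocketMessageType.Close) break;
                Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, result.Count));
            }
        });

        // Send loop – each Enter sends a report; "!q:" exits.
        string line;
        while ((line = Console.ReadLine()) != null && line != "!q:")
        {
            var report = Encoding.UTF8.GetBytes("72|2|0|98052\n");
            await ws.SendAsync(new ArraySegment<byte>(report), WebSocketMessageType.Text, true, CancellationToken.None);
        }

        await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "done", CancellationToken.None);
        await receiveLoop;
    }
}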

This is pretty much all I have on ASP.Net Core Sockets. With a simple application we were able to validate that the new version of SignalR can handle many scenarios the old one couldn’t. We were able to connect to the server from different platforms/environments without using a dedicated SignalR client. The server was capable of handling clients that use different and custom message formats – including a binary format (Protobuf). Finally, all this could be achieved with a small amount of relatively simple code.


SignalR Core Part 1/3: Design Considerations

Disclaimer: SignalR Core is still in early stages and changing rapidly. All information in this post is subject to change.

A few months ago, we started working on the new version of SignalR that will be part of the ASP.Net Core framework. Originally we just wanted to port the existing code and iterate on it. However, we soon realized that doing so would prevent us from enabling new scenarios we wanted to support.

The original version of SignalR was designed around long polling (note that back in the day support for websockets was not as common as it is today – it was not supported by many web browsers, it was not supported in .NET Framework 4, and it was not (and still isn’t) supported natively on Windows 7 and Windows 2008 R2). A JSON-based protocol was baked in and could not be replaced, which blocked the possibility of using other (e.g. binary) formats. Starting the connection was heavy and complicated – it required sending 3 HTTP requests whose responses had to be correlated with messages sent over the newly created transport (you can find a detailed description of the protocol in SignalR on the wire – an informal description of the SignalR protocol – a post I wrote on this very subject). This basically meant that a dedicated client was required to talk to a SignalR server.

In the old design the server was centered around MessageBus – all messages and actions had to go through the message bus. This made the code very complex and error-prone, especially in scale-out scenarios where all the servers were required to have the same data. The state (e.g. cursors/message ids, groups tokens etc.) was kept on the client, which would then send it back to the server when needed (e.g. when reconnecting). The need to keep this state up to date significantly increased the size of the messages exchanged between the server and the client in most non-trivial scenarios.

In the new version of SignalR we wanted to remove some of the limitations of the old version. First, we decided to no longer use long polling as the model transport. Rather, we started with the premise that a full duplex channel is available. While this might sound a lot like websockets, we think it will be possible to take it further in the future and support other protocols like TCP/IP. Note that this does not mean the long polling and server sent events transports are going away – only that we would not drag better transports down to the standards of worse ones. (For example, websockets support binary messages but long polling (until XmlHttpRequest2) and server sent events didn’t, so in the old version of SignalR there was no support for binary messages. In the new version we’d rather base64-encode messages where needed and let users take advantage of what websockets offer.)

Second, we did not want to bake in any specific protocol or message format. Sure, for hub invocations we will still need to be able to get the name of the hub method and the arguments, but we will no longer care how this is represented on the wire. This opens the way to using custom (including binary) formats.

Third, establishing the connection should be lightweight, and connection negotiation can be skipped for persistent duplex transports (like websockets). If a transport is not persistent or uses separate channels for sending and receiving data, connection negotiation is required – it creates a connection id which will be used to identify all the requests sent by a given client. However, if there are no multiple requests because the transport is full duplex and persistent (as in the case of websockets), the connection id is not needed – once the connection is established in the first request it is used to transfer the data in both directions. In practice, this means that you can connect to a SignalR server without a SignalR client – just by using bare websockets.

There are also a few things that we decided not to support in the new SignalR. One of the biggest ones is the ability to automatically re-establish a connection if the client loses the connection to the server. While it may not be obvious, the reconnect feature has a huge impact on the design, complexity and performance of SignalR. Looking at what happens during a reconnect should make it clear. When a client loses the connection, it tries to re-establish it by sending a reconnect request to the server. The reconnect request contains the id of the last message the client received and the groups token containing information about the groups the client belongs to. This means that the server needs to send the message id with each message so the client can tell the server what the last message it received was. The more topics the client is subscribed to, the bigger the message id gets – up to the point where the message id is much bigger than the actual message.

Now, when the server receives a reconnect request it reads the message id and tries to resend all the messages the client missed. To be able to do that, the server needs to keep track of all the messages sent to each client and buffer at least some recent messages so that it can resend them when needed. Indeed, the server has a buffer per connection which it uses to store recent messages. The default size of that buffer is 1000 messages, which creates a lot of memory pressure. The size of the buffer can be configured to make it smaller, but this increases the probability of losing messages when a reconnect happens.

The groups token has similar issues – the more groups the client belongs to, the bigger the token gets. It needs to be sent to the client each time the client joins or leaves a group so that the client can send it back in case of a reconnect to re-establish its group membership. The size of the token limits the number of groups a client can belong to – if the groups token gets too big the reconnect attempt will fail because the URL exceeds the length limit.

While auto-reconnect will no longer be supported in SignalR, users can build their own solution to this problem. Even today people restart their connection after it was closed by adding a handler to the Closed event in which they start a new connection. It can be done in a similar fashion in SignalR Core. It’s true that the client will no longer receive the messages it missed, but this could happen even in the old SignalR – if the number of messages the client missed was greater than the size of the message buffer, the newest messages would overwrite the oldest ones (the message buffer is a ring buffer) so the client would never receive the oldest messages.
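
For illustration, here is roughly what such a handler looks like with the existing (pre-Core) .NET client – a sketch, not production-grade retry logic, and the URL is a placeholder:

using System;
using System.Threading.Tasks;
using Microsoft.AspNet.SignalR.Client;   // the pre-Core .NET client

public class ReconnectingClient
{
    public static async Task RunAsync()
    {
        // Placeholder URL – point this at your own server.
        var connection = new HubConnection("http://localhost:5000/");

        // When the connection closes, wait a bit and start a new one.
        // Real code would cap the number of retries and use backoff.
        connection.Closed += async () =>
        {
            await Task.Delay(5000);
            await connection.Start();
        };

        await connection.Start();
    }
}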

Another scenario we decided not to support in the new version of SignalR is allowing clients to jump between servers (in multi-server scenarios). Before, the client could connect to any server and then reconnect or send data to any other server in the farm. This required that all servers have all the data to be able to handle requests from any client. The way it was implemented was that when a server received a message it would publish it to all the other SignalR servers via MessageBus. This resulted in a huge number of messages being sent between SignalR servers.

(Side note: interestingly, the scenario of reconnecting to a different server than the one the client was originally connected to often did not work correctly due to server misconfiguration. The connection token and groups token are encrypted by the server before being sent to the client. When the server receives the connection token and/or groups token it needs to be able to decrypt it. If it cannot, it rejects the request with a 400 (Bad Request) error. The server uses the machine key to encrypt/decrypt the data, so all machines in the farm must have the same machine key or the connection token (which is included in each request) encrypted on one server can’t be decrypted on another server and the request fails. What I have seen several times was that servers in the farm had different machine keys, so reconnecting to a different server did not actually work.)

In the new SignalR the idea is that the client sticks to just one server. In multi-server scenarios there is a client-to-server map stored externally which tells which client is connected to which server. When a server needs to send a message to a client it no longer has to broadcast the message to all the other servers just because the client might be connected to one of them. Rather, it checks which server the client is connected to and sends the message only to that server, so the traffic among SignalR servers is greatly reduced.
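
Conceptually the routing looks like this (a sketch with an in-memory map – in a real deployment the map would live in external storage, and all the names here are made up):

using System.Collections.Concurrent;
using System.Threading.Tasks;

// Conceptual sketch of the client-to-server map; everything here is illustrative.
public class MessageRouter
{
    // connection id -> the server the client is connected to
    private readonly ConcurrentDictionary<string, string> _clientToServer =
        new ConcurrentDictionary<string, string>();

    public void OnClientConnected(string connectionId, string serverName)
    {
        _clientToServer[connectionId] = serverName;
    }

    public Task SendToClientAsync(string connectionId, byte[] message)
    {
        string server;
        if (_clientToServer.TryGetValue(connectionId, out server))
        {
            // Forward only to the one server that owns the connection,
            // instead of broadcasting to every server in the farm.
            return ForwardToServerAsync(server, connectionId, message);
        }
        return Task.CompletedTask;   // client unknown / disconnected
    }

    private Task ForwardToServerAsync(string server, string connectionId, byte[] message)
    {
        // The transport between servers is out of scope for this sketch.
        return Task.CompletedTask;
    }
}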

The last change I want to talk about, somewhat related to the previous topic, is removing the built-in scale-out support. In the previous version of SignalR there was one official way of scaling out SignalR servers – a scale-out provider would subclass the ScaleoutMessageBus and leave all the heavy lifting to SignalR. It sounds good in theory, but with time it became apparent that with regard to scale-out there is no "one size fits all" solution. Scaling out an application turned out to be very specific to the application’s goals and design. As a result, many users had to implement their own solution for scaling out their applications yet still paid the cost of the built-in scale-out (even when using just one server there is an in-memory message bus all messages go through). While scale-out support is no longer built in, the project contains a Redis-based scale-out solution that can be used as-is or as guidance for creating a custom scale-out solution.

These are, I think, the biggest design/architecture decisions we have made. I believe they will allow us to make SignalR simpler, more reliable, and more performant.