RUBIK Preliminary

5 minute read

Published:

Rookie training

Professor Li Hao was framing a preliminary idea about creating an abstract frame to let user construct a customizable network stack processor laying on layer 2-4, inspired by mOS published at NSDI 2017 conference. Since it is just a first thinking, prelim is required. And after a couple meeting with Li Hao, he asked me to work on programming a network processor that can parser IP, TCP headers and be capable to handle IP fragmentations and TCP segments ordering. At that time I was a sophomore student and a rookie on networking area, therefore learning IP, TCP protocol was exacting and programming stateful TCP protocol in C language was even suffering. However, after summer holiday, I finished my own IP-TCP stack and from this implementation I became familiar with networking protocol and state machine model, which is a critical part in networking field.

naive goal

After I completed my IP-TCP stack, professor and I became confident that our goal to build a abstract framework to describe network was more viable. Therefore, professor Li and I start to think what is a good way to build this abstract. I came up with the idea that a domain specific language may be a appoarch to this question. Thus, I followed IP-TCP stack, since IP and TCP protocols are widely deployed protcol, which definitely have virtue and advtanages, to design my very slapdash language (click to view). And in the mean time I programmed a python interpreter to actually compile this language to C code, which took me about a semester. To be honest, this script is very obscure and lack of elegance in term of visual aesthetic. However, when I finished my own language, though ugly, I built the model to adstract network protocol models, which is very crucial for our further research. In my first design, as you can see in my script, model describing networking protocol is composed of five parts. First, of course, is the name of the protocol. Second part is header format, which determines the composition of the header and is used for further reference. Third part is data, which carries the permanent data and temporary data. Permanent data will be kept track of during a stream or connection’s lifespan and temporary data is used in per packet scope and basically for temporary arithmetic use. The forth part is configuration, which includes settings for timer, hashtable, buffer management and so on. The fifth part is state machine part, which is the skeleton of networking protocol. For more details, click here for a detailed description.

my sentiment of my first stage work on RUBIK

Doing research with Professor Li is my first time that, instead of studying knowledge, I actually design a brand new thing in my life. And it is quiet different than what I thought to be. First thing for a new design is that a idea is so intangible at the very begining that I was unconfident to do the experiment or implementation. At the first time, when professor Li Hao told me his idea about designing an abstract for networking protocol, what in my mind was totally a blank. Besides I was a newbie in computer networking area, I felt this idea is like a dream, it was so imaginative that I did not even know where should I start with. However, professor Li pointed me a very clear way to work on, which led me to build my own TCP/IP stack. Therefore as I had my basic knowledge about network, it became much easier to have my own abstract of network protocol in my brain. In retrospective, I realized that to make a idea tangible, the first thing I should do is to forget the exacting and time-consuming work to achieve my idea, but to focus on the basis of the idea. Aftering having built the cornerstone, I will acquire a much clearer thought about my idea, which may correct some mistakes in my naive idea, and form a much transparent roadmap for my further work about the idea.

Second, designing a abstract of networking protocol is much different with studying network protocol. The difference is not specified only in network study, but is the general differenece between design and knowledge acquisition. When I was studying IP-TCP protocol, What I did was to think why protocol need to be designed like this and what are constraints that compel designers to build protocols like this. However, when I was doing my own design or doing abstraction of networking protocol, it was so exacting that sometimes I would be stuck in one tiny syntax design or logic design for one or two day. 1) The first reason is the essence of design - I am doing something that nobody has done before. Therefore, everything I encountered was brand new. It was common that when I was designing a new syntax, there were two ways to accomplish. However, these two ways result in different consequences, which, since nobody encountered before and there is no ready formula or criterion to assess my choice, are very hard to choose. 2) The second reason is specific in designing new syntax - expressiveness. When I was designing my grammar, I always fell into dillima that I was oscillating between abstarction and flexibility, since the more abstraction I give to users, like syntax candy or high-level syntax to abstract meaningfull events or actions, the more users are suffered from the constraints of these high-level grammar and, on the contrary, the more flexibility I give to users, like bit-level manipulation, data structure manipulation or pointer operations, the less likely I design a new grammar - what I designed is more like a encapsulation of C language.