A simple approach to native network marshalling

jwatte

I used to do serialization using all kinds of fancy templates and macros. You can create pretty elegant systems that way. However, at some point, simplicity should win out. Here's a system that might work just fine for you:

A simple packet class, which really is all you need:

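#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

class packet {
public:
  packet() : pos_(0) {}
  // Append raw bytes to the end of the packet.
  void write(void const *data, size_t size) {
    data_.insert(data_.end(), (char const *)data, (char const *)data + size);
  }
  // Copy out the next size bytes and advance the read position.
  void read(void *data, size_t size) {
    if (size > data_.size() - pos_) throw std::invalid_argument("bad size");
    if (size) memcpy(data, &data_[pos_], size);
    pos_ += size;
  }
  void seek(size_t pos) {
    if (pos > data_.size()) throw std::invalid_argument("bad pos");
    pos_ = pos;
  }
  // Drop all data and rewind, so the packet can be reused.
  void clear() { data_.clear(); pos_ = 0; }
  size_t size() const { return data_.size(); }
  void const *data() const { return data_.size() ? &data_[0] : 0; }
private:
  std::vector<char> data_;
  size_t pos_;
};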

You probably want something simple to deal with big-endian and little-endian data:

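struct net16 {
  unsigned char data_[2];
  // Converting constructor, so that "net16 len = size();" compiles.
  net16(int i = 0) { *this = i; }
  // Stored big-endian (network byte order): high byte first.
  operator int() const { return ((int)data_[0] << 8) | (int)data_[1]; }
  net16 &operator=(int i) { data_[0] = (i>>8)&0xff; data_[1] = i&0xff; return *this; }
  template<typename P> void write(P &p) const {
    p.write(data_, 2);
  }
  template<typename P> void read(P &p) {
    p.read(data_, 2);
  }
};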
 
struct netString : public std::string {
  template<typename P> void write(P &p) {
    net16 len = size();
    len.write(p);
    p.write(c_str(), size());
  }
  template<typename P> void read(P &p) {
    net16 len;
    len.read(p);
    resize(len);
    p.read(&(*this)[0], len);
  }
};
 
template<typename T> struct netVector : public std::vector<T> {
  // Members of a dependent base class must be named explicitly.
  typedef typename std::vector<T>::iterator iterator;
  template<typename P> void read(P &p) {
    net16 len;
    len.read(p);
    this->resize(len);
    for (iterator i(this->begin()), n(this->end()); i != n; ++i) {
      (*i).read(p);
    }
  }
  template<typename P> void write(P &p) {
    net16 len = this->size();
    len.write(p);
    for (iterator i(this->begin()), n(this->end()); i != n; ++i) {
      (*i).write(p);
    }
  }
};

Here, I just say that anything you want to serialize has a "read(P)" and "write(P)" function, and that those functions will call read(data, size) and write(data, size) on the argument (and/or delegate to other members that in turn do that).
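
To make that convention concrete, here is a minimal sketch of a user-defined type that follows it. The Item struct, its item_id and count fields, and the round_trip helper are hypothetical names chosen for illustration. The packet and net16 in this block are compact stand-ins for the ones above, and I'm assuming net16 stores its high byte first.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Compact stand-in for the packet class above: write() appends, read() consumes.
class packet {
public:
  packet() : pos_(0) {}
  void write(void const *data, std::size_t size) {
    char const *b = static_cast<char const *>(data);
    data_.insert(data_.end(), b, b + size);
  }
  void read(void *data, std::size_t size) {
    if (size > data_.size() - pos_) throw std::invalid_argument("bad size");
    if (size) std::memcpy(data, &data_[pos_], size);
    pos_ += size;
  }
  std::size_t size() const { return data_.size(); }
private:
  std::vector<char> data_;
  std::size_t pos_;
};

// 16-bit value stored high byte first (assumed network byte order).
struct net16 {
  unsigned char data_[2];
  net16(int i = 0) { data_[0] = (i >> 8) & 0xff; data_[1] = i & 0xff; }
  operator int() const { return (data_[0] << 8) | data_[1]; }
  template<typename P> void write(P &p) const { p.write(data_, 2); }
  template<typename P> void read(P &p) { p.read(data_, 2); }
};

// Hypothetical payload type: all it needs is read(P) and write(P),
// each delegating to its members in the same order.
struct Item {
  net16 item_id;
  net16 count;
  template<typename P> void write(P &p) const { item_id.write(p); count.write(p); }
  template<typename P> void read(P &p) { item_id.read(p); count.read(p); }
};

// Serialize an Item into a fresh packet and read it back.
inline Item round_trip(Item const &in) {
  packet p;
  in.write(p);
  Item out;
  out.read(p);
  return out;
}
```

The only rule to keep in mind is that read() and write() must visit the members in the same order, since nothing in the byte stream names the fields.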

Finally, define your messages:

struct ChatMessage {
  net16 channel_id;
  netString message;
  template<typename P> void write(P &p) {
    channel_id.write(p);
    message.write(p);
  }
  template<typename P> void read(P &p) {
    channel_id.read(p);
    message.read(p);
  }
};
 
struct ItemMessage {
  netVector<Item> items;
  template<typename P> void write(P &p) {
    items.write(p);
  }
  template<typename P> void read(P &p) {
    items.read(p);
  }
};

Note that I'm assuming that you know what the packet is through some data that comes before the packet. And, if you're on TCP, I'm assuming you know how big the data is, again through some data before the packet. A typical such "framing header" might look like:

struct FramingHeader {
  net16 type;
  net16 size;
  template<typename P> void write(P &p) { type.write(p); size.write(p); }
  template<typename P> void read(P &p) { type.read(p); size.read(p); }
};

Simple usage:

  ChatMessage cm = ...;
  ItemMessage im = ...;
 
  packet p, q, data;
  FramingHeader hdr;
  cm.write(p);
  hdr.type = TYPE_CHAT_MESSAGE;
  hdr.size = p.size();
  hdr.write(q);
  data.write(q.data(), q.size());
  data.write(p.data(), p.size());
 
  q.clear();
  p.clear();
  im.write(p);
  hdr.type = TYPE_ITEM_MESSAGE;
  hdr.size = p.size();
  hdr.write(q);
  data.write(q.data(), q.size());
  data.write(p.data(), p.size());
 
  my_socket.write_or_queue_data(data);

I'll stop now before I write an entire message sending and receiving/dispatching system here, but if you build it up like this, it should be pretty straightforward. On the socket data side, you want to pack all the outgoing data into one big vector, and at a regular interval (each frame, 10 times a second, or whatever) you want to hand all the accumulated data to the socket in one go. Even if you use UDP, that's how you do it: combine many messages into a single packet to cut down on per-packet overhead.
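
The outgoing side could be sketched roughly like this. OutgoingQueue, enqueue and flush are hypothetical names, the payload is assumed to be already serialized, and the header layout (16-bit type, then 16-bit size, both high byte first) is my assumption about what FramingHeader puts on the wire.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Hypothetical outgoing queue: frames each message and batches them
// into one big vector until the next flush.
class OutgoingQueue {
public:
  // Append one message as [type:16][size:16][payload].
  // Assumes type fits in 16 bits.
  void enqueue(int type, Bytes const &payload) {
    if (payload.size() > 0xffff) throw std::length_error("payload too large");
    put16(type);
    put16(static_cast<int>(payload.size()));
    pending_.insert(pending_.end(), payload.begin(), payload.end());
  }
  bool empty() const { return pending_.empty(); }
  // Call once per tick: returns everything queued so far and starts over.
  Bytes flush() {
    Bytes out;
    out.swap(pending_);
    return out;
  }
private:
  // Write a 16-bit value, high byte first.
  void put16(int v) {
    pending_.push_back(static_cast<unsigned char>((v >> 8) & 0xff));
    pending_.push_back(static_cast<unsigned char>(v & 0xff));
  }
  Bytes pending_;
};
```

Game code would call enqueue() whenever it has something to say, and the network tick would call flush() and pass the result to the socket. With UDP you would additionally want to cap the batch at whatever datagram size you're comfortable with, which this sketch does not do.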

Same thing for incoming; when the socket is readable, you receive as much as you can into the end of some big array that has currently pending data. Then, if there's at least sizeof(FramingHeader) available, decode the framing header and check for how much data it needs. If there's that much additional data available, then decode it (based on the type in the header), remove the data from the incoming array, and repeat. Typically you'll want to use a cyclic buffer of some sort rather than vector::erase() to remove the consumed data, for performance reasons.
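
Here is a rough sketch of that receive loop. IncomingBuffer, Frame, feed and next are hypothetical names, and the header layout (16-bit type, then 16-bit size, both high byte first) is again an assumption. Instead of a true cyclic buffer, this version keeps a read offset into a vector and only erases consumed bytes once in a while, which is a simpler way to avoid calling vector::erase() per message.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

struct Frame {
  int type;
  Bytes payload;
};

// Hypothetical receive-side buffer: feed it bytes, pull out complete frames.
class IncomingBuffer {
public:
  IncomingBuffer() : head_(0) {}
  // Append whatever the socket read returned.
  void feed(unsigned char const *data, std::size_t size) {
    buf_.insert(buf_.end(), data, data + size);
  }
  // Extract the next complete frame; returns false if more data is needed.
  bool next(Frame &out) {
    std::size_t avail = buf_.size() - head_;
    if (avail < kHeaderSize) return false;           // header not complete yet
    int type = get16(head_);
    std::size_t size = static_cast<std::size_t>(get16(head_ + 2));
    if (avail - kHeaderSize < size) return false;    // payload not complete yet
    std::size_t body = head_ + kHeaderSize;
    out.type = type;
    out.payload.assign(buf_.begin() + body, buf_.begin() + body + size);
    head_ = body + size;
    compact();
    return true;
  }
private:
  static const std::size_t kHeaderSize = 4;  // two 16-bit fields
  // Read a 16-bit value stored high byte first.
  int get16(std::size_t at) const { return (buf_[at] << 8) | buf_[at + 1]; }
  // Reclaim consumed space: free when the buffer is fully drained,
  // otherwise erase only once the consumed prefix is large.
  void compact() {
    if (head_ == buf_.size()) {
      buf_.clear();
      head_ = 0;
    } else if (head_ > 4096 && head_ > buf_.size() / 2) {
      buf_.erase(buf_.begin(), buf_.begin() + head_);
      head_ = 0;
    }
  }
  Bytes buf_;
  std::size_t head_;
};
```

After each socket read you would call feed() once and then call next() in a loop until it returns false, dispatching on frame.type each time. A real system would also want to reject frames whose type or size makes no sense, rather than wait forever for data that never arrives.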