Reference

Pack into and Unpack from Bytes

The whole reason of using bisturi is to parse some binary payload into a handy Python object or the other way around, from Python to a binary payload.

Parsing is know as unpack and generating the binary payload is known as pack. If you are familiar with Python’s struct library you may recognize the names: we are following the same convention.

Unpack: from bytes to Python

First, we create a simple packet class as we did before.

>>> from bisturi.packet import Packet
>>> from bisturi.field import Int, Data

>>> class TLP(Packet):
...    type = Int(1)
...    length = Int()
...    payload = Data(length)

Now, let be the following string of bytes:

>>> s1 = b'\x02\x00\x00\x00\x03abc'

Can you see what should be the value of type or payload?

I hope!. If not, let the packet dissect the string for you calling unpack:

>>> p = TLP.unpack(s1)
>>> p.type
2
>>> p.length
3
>>> p.payload
b'abc'

unpack optionally receives the offset from where to start reading.

In the following example we would like to ignore the first 3 bytes of s2 and parse the rest as a TLP.

>>> s2 = b'xxx\x01\x00\x00\x00\x01d'

Yes, we could do TLP.unpack(s2[3:]) but that makes a copy – Python is inflexible here – and a copy can be somewhat expensive.

Using offset= is better:

>>> q = TLP.unpack(s2, offset=3)   #ignore the first 3 bytes "xxx"
>>> q.type
1
>>> q.length
1
>>> q.payload
b'd'

[tl;dr] Field types

Now, if you follow the Python’s rules you may ask: “are the fields class attributes or instance attributes?”

They are instance attributes so, as you will expect, two packets may have different values for the same field.

>>> p.type
2
>>> q.type
1

Both are different. Under the hood, those object’s attributes are optimized for use low memory and the packets don’t have a __dict__ instance.

>>> hasattr(p, '__dict__')
False

Pack: from Python to bytes

The reverse of unpack() is, obviously, pack(): it converts a packet into a sequence of bytes:

>>> p.pack()
b'\x02\x00\x00\x00\x03abc'

Ignoring some special cases (not covered here) it is safe to assume that a packet is packed into the same sequence of bytes from it was unpacked:

>>> p.pack() == s1
True
>>> q.pack() == s2[3:]
True

Error handling

Life is never easy and the things not always work as expected: as any process, pack() and unpack() may fail.

If a field cannot be unpacked, an exception is raised with the full stack of packets and offsets:

>>> def some_function(raw):
...    q = TLP.unpack(raw)

>>> s = b'\x00\x00\x00\x00\x04a'

>>> some_function(s)                # byexample: +norm-ws
Traceback (most recent call last):
<...>PacketError: Error when unpacking the field 'payload'
of packet TLP at 00000005: Unpacked 1 bytes but expected 4
Packet stack details:
    00000005 TLP                            .payload
Field's exception:
<...>
Exception: Unpacked 1 bytes but expected 4<...>

The exception is telling us that when bisturi tried to unpack the field payload of the TLP packet it failed.

It was able to unpack 1 byte but expected to unpack 4.

A similar error could happen when packing.

Python is dynamic and bisturi does not enforce any type constrain on the fields attributes.

But when the packet is converted to bytes bisturi will make sure that every field is converted and if something fails it will raise an exception.

>>> p = TLP()
>>> p.length = "a non integer!"

>>> p.pack()                        # byexample: +norm-ws
Traceback (most recent call last):
<...>PacketError: Error when packing the field 'between 'type' and 'length''
of packet TLP at 00000000: <...> argument <...> integer
Packet stack details:
    00000000 TLP                            .between 'type' and 'length'
Field's exception:
<...>

Basically you cannot put apples and expect bisturi to make sense of them. It has not such magic built-in.

bisturi is capable of packing/unpacking invalid data but that and more about debugging and errors are for some advanced lecture.

[extra] Working with files

No always you will have the full string in memory to parse but you will have a file instead.

SeekableFile adapter will make the file behave as a string so bisturi can use it.

>>> from bisturi.util import SeekableFile

>>> seekable_file = SeekableFile(open('tests/ds/tlp_abc', 'rb'))

>>> p = TLP.unpack(seekable_file)
>>> p.length
3
>>> p.payload
b'abc'