Understanding industrial network traffic: dissectors, Lua and Kaitai
Traffic analysis of industrial protocols
Traffic or frame analysis is a highly useful technique for verifying the security of communications and of the protocols involved. Its main purpose is to obtain as much information as possible about those communications, but what happens if the analyser in use cannot interpret the protocol's fields?
In this industry, we often have to deal with protocols for which no frame analyser is available, either because they are proprietary or because, although standard, they are not widely used. Analysing these protocols requires the extra effort of creating a new dissector (a plugin that lets the main program break a data packet down according to a given set of rules) so that the main network analysis tools can process them automatically.
However, not all protocols can be interpreted equally easily. A first classification depends on whether a specification exists. As discussed above, protocols that are not widely used may lack analyser support even when their specification is available, while for proprietary protocols there is usually no public information at all.
A second classification is based on the complexity of the protocol. The logical division is between simple protocols, in which each frame contains all the information it needs and can be analysed independently, and complex ones, in which a frame depends on information from previous frames and certain fields must be reassembled or interpreted (decompressed or decrypted).
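To illustrate what reassembly involves for the second group, the following minimal Python sketch rebuilds complete messages from a TCP byte stream using a length field. The header layout is an assumption chosen for illustration (a 6-byte header with a 2-byte big-endian length at offset 4, as in the Modbus/TCP MBAP header), and the class name `LengthPrefixedReassembler` is hypothetical; a real protocol may need a different layout.

```python
import struct
from typing import List


class LengthPrefixedReassembler:
    """Rebuild complete messages from arbitrary chunks of a byte stream.

    Assumed layout (illustrative): a 6-byte header whose last two bytes
    are a big-endian count of the bytes that follow the header.
    """

    HEADER_SIZE = 6
    LENGTH_OFFSET = 4

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, segment: bytes) -> List[bytes]:
        """Add one captured segment and return every message now complete."""
        self._buffer.extend(segment)
        messages = []
        while len(self._buffer) >= self.HEADER_SIZE:
            (body_len,) = struct.unpack_from(">H", self._buffer, self.LENGTH_OFFSET)
            total = self.HEADER_SIZE + body_len
            if len(self._buffer) < total:
                break  # wait for the rest of the message
            messages.append(bytes(self._buffer[:total]))
            del self._buffer[:total]
        return messages
```

Each call to `feed` may return zero, one or several messages, depending on how the stream happened to be split into segments.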
Most older industrial protocols are message-oriented and therefore fall into the simple group. They often use a serial link (mainly RS-485) as the physical medium. Porting these protocols to Ethernet increased their complexity; although many of them are still message-oriented, newer protocols tend to be more complex and usually belong to the second group. In addition, proprietary protocols are considerably more common in industrial environments than in corporate ones, which makes dissectors harder to create because no public information is available.
Tools to create new dissectors: Lua and Kaitai
Traffic analysis tools rely on protocol-specific dissectors, which allow a protocol to be understood and the fields in its frames to be delimited. Users can usually add new dissectors to support protocols that were not initially covered.
Wireshark, for example, ships with many predefined dissectors, a large number of them for industrial protocols such as Modbus/TCP, DNP3 and IEC 104. Where necessary, it also allows new dissectors to be added as scripts written in Lua.
Wireshark is one of the most widely used and best-known analysis tools, but it is not the only one, so developing a script exclusively for it may be counter-productive. For this reason, alternatives such as Kaitai Struct have been created; they let users define parsers for binary structures and later export them to different languages for integration into a range of applications.
Steps to create a dissector
To analyse a protocol properly, so that the analysis can later be turned into a working dissector, a series of steps must be followed in order to obtain as much information as possible.
The first thing to obtain is information about the protocol and the fields that define it. If the protocol is open, there will be a technical specification with all the details. These specifications often have to be purchased, but the cost may be offset by the value obtained from the analysis, for example for a company whose activities are related to security, or for whatever other reasons the developer may have.
If no specification is available, information must be sought from alternative sources such as technical forums and presentations, where the frame format and some of its fields may already have been analysed to some extent. This information is usually generic and not very detailed; for instance, it is unlikely to describe every function of the protocol.
All this information is relevant when defining a suitable interpretation of the protocol and its fields.
Collect frames of the protocol to be dissected
If the technical specification is available, or all the information about the protocol has already been gathered, this frame-collection stage is not necessary; otherwise, having actual traffic from the protocol is the only way to complete our knowledge of it.
The aim of this stage is to capture as many packets of the protocol as possible, keeping the captured traffic delimited and known so that it can be classified properly. Ideally, the captures would include every type of frame the protocol defines, but this is not always possible.
Any traffic capture tool can be used to gather frames, but the capture should not be contaminated by frames from other protocols. To avoid this, the equipment involved in the communication has to be isolated. The capture can be made through a mirror (SPAN) port on a switch or router, or with dedicated hardware known as a network tap.
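When physical isolation is not perfect, a capture can also be cleaned up afterwards. The following Python sketch reads a classic pcap file and keeps only the TCP payloads exchanged on a given port. It assumes the classic pcap format with microsecond timestamps, untagged Ethernet frames and IPv4; pcapng files, VLAN tags and IPv6 are not handled, and the function names are illustrative.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def iter_pcap_frames(data):
    """Yield the raw link-layer frames stored in a classic pcap file."""
    if len(data) < 24:
        raise ValueError("too short for a pcap global header")
    if struct.unpack_from("<I", data)[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack_from(">I", data)[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # skip the global header
    while offset + 16 <= len(data):
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset)
        offset += 16
        if offset + incl_len > len(data):
            break  # truncated final record
        yield data[offset:offset + incl_len]
        offset += incl_len


def tcp_payload_on_port(frame, port):
    """Return the TCP payload if the frame is Ethernet/IPv4/TCP and uses
    `port` as source or destination; otherwise return None."""
    if len(frame) < 14 + 20 + 20:
        return None
    if struct.unpack_from(">H", frame, 12)[0] != 0x0800:  # EtherType IPv4
        return None
    ip_start = 14
    ip_header_len = (frame[ip_start] & 0x0F) * 4
    ip_total_len = struct.unpack_from(">H", frame, ip_start + 2)[0]
    if frame[ip_start + 9] != 6:  # IP protocol number for TCP
        return None
    tcp_start = ip_start + ip_header_len
    if len(frame) < tcp_start + 20:
        return None
    src_port, dst_port = struct.unpack_from(">HH", frame, tcp_start)
    if port not in (src_port, dst_port):
        return None
    tcp_header_len = (frame[tcp_start + 12] >> 4) * 4
    # Use the IP total length so Ethernet padding is not mistaken for data.
    return frame[tcp_start + tcp_header_len:ip_start + ip_total_len]
```

For example, `[p for p in (tcp_payload_on_port(f, 502) for f in iter_pcap_frames(data)) if p]` would keep only non-empty payloads on port 502, the port the article cites for Modbus/TCP.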
Identify general common parts
When starting work on the dissector, it is important to identify which parts are common to all frames and which are not, so that the protocol's frames can first be recognised and separated from those of other protocols.
These first fields to be identified are:
- Communications port: the typical port or ports (TCP or UDP) over which the communication takes place. This port may be fixed and reserved for the protocol (such as 502/TCP for Modbus/TCP).
- Header of the industrial protocol: the size of the protocol's header must be identified, and it must not be confused with the Ethernet frame header and its fields. Some of the most typical fields are:
  - Identification of the protocol: usually the first bytes of the frame.
  - Identification of the station or connection node: especially in client/server protocols.
  - Size of the data: the length of the protocol's own data.
- Data: delimited by the data size given in the header, this is usually the part of the frame that varies most, and common parts are hard to find in it. The data field may end with a CRC or hash used to verify data integrity.
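The search for these common parts can be partly automated. The Python sketch below, given a set of captured payloads, reports the offsets whose byte never changes (candidates for a protocol identifier or station identifier) and the offsets where a big-endian integer equals the number of bytes that follow it (candidates for a length field). The function names and the two heuristics are illustrative assumptions, not a standard method; results on a small sample can be coincidental and should be checked by hand.

```python
def constant_offsets(frames):
    """Return {offset: byte value} for offsets identical in every frame."""
    if not frames:
        return {}
    shortest = min(len(f) for f in frames)
    first = frames[0]
    return {
        i: first[i]
        for i in range(shortest)
        if all(f[i] == first[i] for f in frames)
    }


def length_field_candidates(frames, width=2):
    """Return offsets where a big-endian integer of `width` bytes equals
    the number of bytes remaining after that field, in every frame."""
    if not frames:
        return []
    shortest = min(len(f) for f in frames)
    candidates = []
    for offset in range(shortest - width + 1):
        end = offset + width
        if all(int.from_bytes(f[offset:end], "big") == len(f) - end
               for f in frames):
            candidates.append(offset)
    return candidates
```

The more varied the captured frames, the fewer false candidates survive, which is one reason to collect as many frame types as possible.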
Identify specific common parts
Once the general structure of the frame has been defined, it is time to identify the parts that are common to particular groups of frames. The easiest approach is to separate requests from responses first and look for common parts within each group. Some of the fields or characteristics to look for are:
- Identification of the action or function: industrial protocols are usually based on a code that defines the action to be carried out. This field must be identified, together with all its possible values, because the code determines how the following field is interpreted.
- Data field: this usually contains the variables involved in the action and their values. The way the variables are specified must be identified, which may be:
  - listing all variables,
  - giving the first and last variable,
  - giving the first variable and the number of variables.
The values come next and depend on the number of variables; they may nevertheless have specific characteristics, such as a fixed number of bytes per value.
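As an illustration of a function code followed by a "start plus amount" data field, the Python sketch below decodes the Modbus Read Holding Registers function (code 0x03) as laid out in the public Modbus specification: the request carries a starting address and a quantity, and the response carries a byte count followed by 2-byte values. The function names are illustrative, and exception responses are deliberately not handled.

```python
import struct

READ_HOLDING_REGISTERS = 0x03


def parse_read_holding_request(pdu):
    """Decode a request: function code, start address, quantity."""
    if len(pdu) != 5:
        raise ValueError("request PDU must be 5 bytes long")
    function, start, quantity = struct.unpack(">BHH", pdu)
    if function != READ_HOLDING_REGISTERS:
        raise ValueError("unexpected function code")
    return {"function": function, "start": start, "quantity": quantity}


def parse_read_holding_response(pdu):
    """Decode a response: function code, byte count, 16-bit values."""
    if len(pdu) < 2:
        raise ValueError("response PDU too short")
    function, byte_count = struct.unpack_from(">BB", pdu, 0)
    if function != READ_HOLDING_REGISTERS:
        raise ValueError("unexpected function code")
    if byte_count % 2 or len(pdu) != 2 + byte_count:
        raise ValueError("byte count does not match the data length")
    # Each value occupies a fixed two bytes, big-endian.
    values = list(struct.unpack_from(">%dH" % (byte_count // 2), pdu, 2))
    return {"function": function, "values": values}
```

Note that the response does not repeat the start address: the values can only be mapped back to variables by pairing the response with its request.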
Once all the information that defines the protocol is available, it only remains to write it down in a language suited to the traffic analyser to be used.
Example of dissector
Modbus/TCP, one of the most widely used industrial communication protocols, will serve as the example for creating a dissector. Even though most analysers already include a dissector for it, it remains a valid example.
Modbus is an open protocol, as indicated in the study Protocols and network security in ICS Infrastructure.
The Modbus frame is fully defined in the specification, so the dissector can be written directly, without first analysing captured frames.
In this example, Kaitai is used to define the structure of the protocol; the definition is then translated into Lua with the kaitai-to-wireshark script so that it can be loaded into Wireshark.
Kaitai describes binary structures in .ksy files written in YAML, and the Modbus protocol can be defined in one such file.
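As an illustration of the fields such a description has to cover, here is a minimal Python sketch of the Modbus/TCP frame layout from the public specification: a 7-byte MBAP header (transaction identifier, protocol identifier, length, unit identifier) followed by the function code and its data. It is neither the .ksy file nor the output of any Kaitai tool, and the names `ModbusTcpFrame` and `parse_modbus_tcp` are illustrative.

```python
import struct
from typing import NamedTuple


class ModbusTcpFrame(NamedTuple):
    transaction_id: int   # 2 bytes, echoed by the server
    protocol_id: int      # 2 bytes, 0 for Modbus
    length: int           # 2 bytes, number of bytes after this field
    unit_id: int          # 1 byte, station identifier
    function_code: int    # 1 byte, action to carry out
    data: bytes           # remaining bytes, depends on the function


def parse_modbus_tcp(frame: bytes) -> ModbusTcpFrame:
    """Split one complete Modbus/TCP message into its fields."""
    if len(frame) < 8:
        raise ValueError("frame shorter than MBAP header plus function code")
    transaction_id, protocol_id, length, unit_id, function_code = (
        struct.unpack_from(">HHHBB", frame, 0))
    if protocol_id != 0:
        raise ValueError("protocol identifier is not 0 (not Modbus)")
    if length != len(frame) - 6:
        raise ValueError("length field does not match the frame size")
    return ModbusTcpFrame(transaction_id, protocol_id, length,
                          unit_id, function_code, frame[8:])
```

All multi-byte fields are big-endian, which is why every format string starts with `>`; a declarative description would state the same byte order once for the whole structure.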
Passing the .ksy file to the conversion script produces a file in .lua format.
With that file and a few modifications, such as adding the port (in the <port> section already present in the file) and revising some types, the version to be used in Wireshark is obtained.
The resulting script must be placed in Wireshark's plugins directory. Earlier versions of Wireshark required a configuration file to be modified, but in recent versions simply placing the script in the plugins directory is enough for it to be loaded automatically when the program starts.