List of (automatic) protocol reverse engineering tools/methods/approaches for network protocols
This is a collection of 69 scientific papers about (automatic) protocol reverse engineering (PRE) methods and tools. The papers are categorized into groups so that it is easier to get an overview of existing solutions based on the problem you want to tackle.
The collection is based on the following three surveys and has since been extended:
- J. Narayan, S. K. Shukla, and T. C. Clancy, “A Survey of Automatic Protocol Reverse Engineering Tools,” ACM Computing Surveys, vol. 48, no. 3, pp. 1–26, Feb. 2016, doi: 10.1145/2840724. PDF
- J. Duchêne, C. Le Guernic, E. Alata, V. Nicomette, and M. Kaâniche, “State of the art of network protocol reverse engineering tools,” Journal of Computer Virology and Hacking Techniques, vol. 14, no. 1, pp. 53–68, Feb. 2018, doi: 10.1007/s11416-016-0289-8. PDF
- B. D. Sija, Y.-H. Goo, K.-S. Shim, H. Hasanova, and M.-S. Kim, “A Survey of Automatic Protocol Reverse Engineering Approaches, Methods, and Tools on the Inputs and Outputs View,” Security and Communication Networks, vol. 2018, pp. 1–17, 2018, doi: 10.1155/2018/8370341. PDF
Furthermore, there is a very extensive survey that focuses on the methods and approaches of PRE tools that are based on network traces. The work of Kleber et al. is an excellent starting point to see what has already been tried and for which use cases a method works best.
- S. Kleber, L. Maile, and F. Kargl, “Survey of Protocol Reverse Engineering Algorithms: Decomposition of Tools for Static Traffic Analysis,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 526–561, 2019, doi: 10.1109/COMST.2018.2867544. PDF
Please help extend this collection by adding papers to tools.ods.
Overview
Name | Year | Approach used |
---|---|---|
PIP [1] | 2004 | Keyword detection and Sequence alignment based on Needleman and Wunsch 1970 and Smith and Waterman 1981; this approach was applied and extended by many following papers |
GAPA [2] | 2005 | Protocol analyzer and open language that uses the protocol analyzer specification Spec → it is meant to be integrated in monitoring and analyzing tools |
ScriptGen [3] | 2005 | Grouping and clustering messages, find edges from clusters to clusters for being able to replay messages once a similar message arrives |
RolePlayer [4] | 2006 | Byte-wise sequence alignment (find variable fields in messages) and clustering with FSM simplification |
Ma et al. [5] | 2006 | Please review |
FFE/x86 [6] | 2006 | Please review |
Replayer [7] | 2006 | Please review |
Discoverer [8] | 2007 | Tokenization of messages, recursive clustering to find formats, merge similar formats |
Polyglot [9] | 2007 | Dynamic taint-analysis |
PEXT [10] | 2007 | Message clustering for creating FSM graph and simplify FSM graph |
Rosetta [11] | 2007 | Please review |
AutoFormat [12] | 2008 | Dynamic taint-analysis |
Tupni [13] | 2008 | Dynamic taint-analysis; look for loops to identify boundaries within messages |
Boosting [14] | 2008 | Please review |
ConfigRE [15] | 2008 | Please review |
ReFormat [16] | 2009 | Dynamic taint-analysis, especially targeting encrypted protocols by looking for bitwise and arithmetic operations |
Prospex [17] | 2009 | Dynamic taint-analysis followed by message clustering; optionally provides fuzzing candidates for the Peach fuzzer |
Xiao et al. [18] | 2009 | Please review |
Trifilo et al. [19] | 2009 | Measure byte-wise variances in aligned messages |
Antunes and Neves [20] | 2009 | Please review |
Dispatcher [21] | 2009 | Dynamic taint-analysis (successor of Polyglot using sent instead of received messages) |
Fuzzgrind [22] | 2009 | Please review |
REWARDS [23] | 2010 | Please review |
MACE [24] | 2010 | Please review |
Whalen et al. [25] | 2010 | Please review |
AutoFuzz [26] | 2010 | Please review |
ReverX [27] | 2011 | Speech recognition (thus only for text-based protocols) to find carriage returns and spaces, afterwards looking for frequencies of keywords; multiple partial FSMs are merged and simplified to get PFSM |
Veritas [28] | 2011 | Identifying keywords, clustering and transition probability → probabilistic protocol state machine |
Biprominer [29] | 2011 | Statistical analysis including three phases, learning phase, labeling phase and transition probability model building phase. See this figure. |
ASAP [30] | 2011 | Please review |
Howard [31] | 2011 | Please review |
ProDecoder [32] | 2012 | Successor of Biprominer which also addresses text-based protocols; two phases are used: first apply Biprominer, then use Needleman-Wunsch for alignment |
Zhang et al. [33] | 2012 | Please review |
Netzob [34] | 2012 | See this figure |
PRISMA [35] | 2012 | Please review, follow-up paper/project to ASAP |
ARTISTE [36] | 2012 | Please review |
Wang et al. [37] | 2013 | Capturing data, identifying frames, and inferring the format by looking at the frequency of frames and doing association analysis (using Apriori and FP-Growth). |
Laroche et al. [38] | 2013 | Please review |
AutoReEngine [39] | 2013 | Apriori Algorithm (based on Agrawal/Srikant 1994). Identify fields and keywords by considering the amount of occurrences. Message formats are considered as series of keywords. State machines are derived from labeled messages or frequent subsequences. See this figure for clarification. |
Dispatcher2 [40] | 2013 | Please review |
ProVeX [41] | 2013 | Identify Botnet traffic and try to infer the botnet type by using signatures |
Meng et al. [42] | 2014 | Please review |
AFL [43] | 2014 | Please review |
Proword [44] | 2014 | Please review |
ProGraph [45] | 2015 | Please review |
FieldHunter [46] | 2015 | Please review |
RS Cluster [47] | 2015 | Please review |
UPCSS [48] | 2015 | Please review |
ARGOS [49] | 2015 | Please review |
PULSAR [50] | 2015 | Reverse engineer network protocols with the aim of fuzzing them with this knowledge |
Li et al. [51] | 2015 | Please review |
Cai et al. [52] | 2016 | Please review |
WASp [53] | 2016 | Pcap files are provided with context information (i.e. known MAC address), then grouping and analysing (looking for CRC, N-gram, Entropy, Features, Ranges), afterwards report creation based on scoring. |
PRE-Bin [54] | 2016 | Please review |
Xiao et al. [55] | 2016 | Please review |
PowerShell [56] | 2017 | Please review |
ProPrint [57] | 2017 | Please review |
ProHacker [58] | 2017 | Please review |
Esoul and Walkinshaw [59] | 2017 | Please review |
PREUGI [60] | 2017 | Please review |
NEMESYS [61] | 2018 | Please review |
Goo et al. [62] | 2019 | Apriori based: Finding “frequent contiguous common subsequences” via a new Contiguous Sequential Pattern (CSP) algorithm, which is based on Generalized Sequential Pattern (GSP) and other Apriori algorithms. CSP is used three times hierarchically to extract different information/fields based on previous results. |
Universal Radio Hacker [63] | 2019 | Physical layer based analysis of proprietary wireless protocols considering wireless specific properties like Received Signal Strength Indicator (RSSI) and using statistical methods |
Luo et al. [64] | 2019 | From abstract: “[…] this study proposes a type-aware approach to message clustering guided by type information. The approach regards a message as a combination of n-grams, and it employs the Latent Dirichlet Allocation (LDA) model to characterize messages with types and n-grams via inferring the type distribution of each message.” |
Sun et al. [65] | 2019 | Please review |
Yang et al. [66] | 2020 | Using deep-learning (LSTM-FCN) for reversing binary protocols |
Sun et al. [67] | 2020 | “To measure format similarity of unknown protocol messages in a proper granularity, we propose relative measurements, Token Format Distance (TFD) and Message Format Distance (MFD), based on core rules of Augmented Backus-Naur Form (ABNF).” Messages are clustered with the density-based algorithm DBSCAN; the Silhouette Coefficient and Dunn Index are used in the clustering process. |
Shim et al. [68] | 2020 | Follow up on Goo et al. 2019 |
NEMETYL [69] | 2020 | Follow-up on NEMESYS (Kleber et al., 2018) |
IPART [70] | 2020 | Using an extended voting expert algorithm to infer the boundaries of fields, followed by three phases: tokenizing, classifying and clustering. |
GrAMeFFSI [71] | 2020 | Using GrAMeFFSI, a method based on graph analysis that can infer protocol message formats as well as certain field semantics for binary protocols from network traces. |
NetPlier [72] | 2021 | Build an end-to-end system NETPLIER, which stands for “Probabilistic NETwork ProtocoL Reverse EngIneERing”. It takes network traces as input and produces the final message format. |
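Several entries in the table above (e.g. PIP and ProDecoder) rely on Needleman-Wunsch sequence alignment. The following is a minimal sketch of that general technique applied to two byte strings, not the implementation of any listed tool. The scoring values (`match=1`, `mismatch=-1`, `gap=-1`) and the helper `variable_positions` are assumptions chosen for illustration.

```python
def needleman_wunsch(a: bytes, b: bytes, match=1, mismatch=-1, gap=-1):
    """Globally align two messages.

    Returns (aligned_a, aligned_b, score); the aligned lists have equal
    length and use None to mark a gap.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append(None)
            i -= 1
        else:
            aligned_a.append(None)
            aligned_b.append(b[j - 1])
            j -= 1
    return aligned_a[::-1], aligned_b[::-1], score[n][m]


def variable_positions(aligned_a, aligned_b):
    """Hypothetical helper: aligned offsets where the two messages differ."""
    return [k for k, (x, y) in enumerate(zip(aligned_a, aligned_b)) if x != y]
```

Offsets that differ across many aligned message pairs are candidates for variable fields, while offsets that stay constant hint at keywords or fixed headers. Real tools typically align many messages at once and tune the scoring to the protocol at hand.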
Input and Output
NetT: input is a network trace (e.g. pcap)
ExeT: input is an execution trace (code/binary at hand)
PF: output is protocol format (describing the syntax)
PFSM: output is protocol finite state machine (describing semantic/sequential logic)
Name | Year | NetT | ExeT | PF | PFSM | Other Output |
---|---|---|---|---|---|---|
PIP [1] | 2004 | ✔ | Keywords/ fields | |||
GAPA [2] | 2005 | ✔ | ✔ | ✔ | ||
ScriptGen [3] | 2005 | ✔ | Dialogs/scripts (for replaying) | |||
RolePlayer [4] | 2006 | ✔ | Dialogs/scripts | |||
Ma et al. [5] | 2006 | ✔ | App-identification | |||
FFE/x86 [6] | 2006 | ✔ | ||||
Replayer [7] | 2006 | ✔ | ||||
Discoverer [8] | 2007 | ✔ | ✔ | |||
Polyglot [9] | 2007 | ✔ | ✔ | |||
PEXT [10] | 2007 | ✔ | ✔ | |||
Rosetta [11] | 2007 | ✔ | ||||
AutoFormat [12] | 2008 | ✔ | ✔ | |||
Tupni [13] | 2008 | ✔ | ✔ | |||
Boosting [14] | 2008 | ✔ | Field(s) | |||
ConfigRE [15] | 2008 | ✔ | ||||
ReFormat [16] | 2009 | ✔ | ✔ | |||
Prospex [17] | 2009 | ✔ | ✔ | ✔ | ✔ | |
Xiao et al. [18] | 2009 | ✔ | ✔ | |||
Trifilo et al. [19] | 2009 | ✔ | ✔ | |||
Antunes and Neves [20] | 2009 | ✔ | ✔ | |||
Dispatcher [21] | 2009 | ✔ | C&C malware | |||
Fuzzgrind [22] | 2009 | ✔ | ||||
REWARDS [23] | 2010 | ✔ | ||||
MACE [24] | 2010 | ✔ | ||||
Whalen et al. [25] | 2010 | ✔ | ✔ | |||
AutoFuzz [26] | 2010 | ✔ | ✔ | ✔ | ||
ReverX [27] | 2011 | ✔ | ✔ | ✔ | ||
Veritas [28] | 2011 | ✔ | ✔ | |||
Biprominer [29] | 2011 | ✔ | ✔ | ✔ | ||
ASAP [30] | 2011 | ✔ | Semantics | |||
Howard [31] | 2011 | ✔ | ||||
ProDecoder [32] | 2012 | ✔ | ✔ | |||
Zhang et al. [33] | 2012 | ✔ | ✔ | |||
Netzob [34] | 2012 | ✔ | ✔ | ✔ | ✔ | |
PRISMA [35] | 2012 | ✔ | ||||
ARTISTE [36] | 2012 | ✔ | ||||
Wang et al. [37] | 2013 | ✔ | ✔ | |||
Laroche et al. [38] | 2013 | ✔ | ✔ | |||
AutoReEngine [39] | 2013 | ✔ | ✔ | ✔ | ||
Dispatcher2 [40] | 2013 | ✔ | C&C malware | |||
ProVeX [41] | 2013 | ✔ | Signatures | |||
Meng et al. [42] | 2014 | ✔ | ✔ | |||
AFL [43] | 2014 | ✔ | ||||
Proword [44] | 2014 | |||||
ProGraph [45] | 2015 | ✔ | ✔ | |||
FieldHunter [46] | 2015 | ✔ | Fields | |||
RS Cluster [47] | 2015 | ✔ | Grouped-messages | |||
UPCSS [48] | 2015 | ✔ | Proto-classification | |||
ARGOS [49] | 2015 | ✔ | ||||
PULSAR [50] | 2015 | |||||
Li et al. [51] | 2015 | ✔ | ✔ | |||
Cai et al. [52] | 2016 | ✔ | ✔ | |||
WASp [53] | 2016 | ✔ | ✔ | scored analysis reports, spoofing candidates | ||
PRE-Bin [54] | 2016 | ✔ | ✔ | |||
Xiao et al. [55] | 2016 | ✔ | ✔ | |||
PowerShell [56] | 2017 | ✔ | Dialogs/scripts | |||
ProPrint [57] | 2017 | ✔ | Fingerprints | |||
ProHacker [58] | 2017 | ✔ | Keywords | |||
Esoul and Walkinshaw [59] | 2017 | |||||
PREUGI [60] | 2017 | ✔ | ✔ | |||
NEMESYS [61] | 2018 | ✔ | ✔ | |||
Goo et al. [62] | 2019 | ✔ | ✔ | ✔ | ||
Universal Radio Hacker [63] | 2019 | ✔ | ✔ | |||
Luo et al. [64] | 2019 | |||||
Sun et al. [65] | 2019 | |||||
Yang et al. [66] | 2020 | ✔ | ✔ | |||
Sun et al. [67] | 2020 | |||||
Shim et al. [68] | 2020 | ✔ | ✔ | |||
NEMETYL [69] | 2020 | ✔ | ✔ | |||
IPART [70] | 2020 | ✔ | ✔ | |||
GrAMeFFSI [71] | 2020 | ✔ | ✔ | |||
NetPlier [72] | 2021 | ✔ | ✔ |
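To make the PFSM output type more concrete, here is a small sketch of how transition probabilities between message types could be estimated from observed sessions, in the spirit of the probabilistic state machines mentioned for Veritas and Biprominer. It assumes that every message has already been labeled with a type (in practice this comes from a preceding clustering step); the function name and the FTP-like labels are made up for illustration.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next type | current type) from labeled sessions.

    `sessions` is a list of sessions, each a list of message-type labels
    in the order the messages were observed.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Count every pair of consecutive message types
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    # Normalize the counts of each state into probabilities
    return {
        state: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for state, followers in counts.items()
    }


# Illustrative, hand-written sessions (not taken from any real trace)
sessions = [
    ["USER", "PASS", "LIST", "QUIT"],
    ["USER", "PASS", "RETR", "QUIT"],
    ["USER", "PASS", "LIST", "LIST", "QUIT"],
]
probabilities = transition_probabilities(sessions)
```

The resulting nested dictionary can be read as a weighted graph: keys are states, and each inner entry is an outgoing edge with its estimated probability. The approaches listed above typically add further steps such as merging equivalent states or pruning rare transitions.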
Tested protocols
Name | Year | Text-based | Binary-based | Hybrid | Other Protocols |
---|---|---|---|---|---|
PIP [1] | 2004 | HTTP | |||
GAPA [2] | 2005 | HTTP | |||
ScriptGen [3] | 2005 | HTTP | NetBIOS | DCE | |
RolePlayer [4] | 2006 | HTTP, FTP, SMTP, NFS, TFTP | DNS, BitTorrent, QQ, NetBios | SMB, CIFS | |
Ma et al. [5] | 2006 | HTTP, FTP, SMTP, HTTPS (TCP-Protos) | DNS, NetBIOS, SrvLoc (UDP-Protos) | ||
FFE/x86 [6] | 2006 | ||||
Replayer [7] | 2006 | ||||
Discoverer [8] | 2007 | HTTP | RPC | SMB, CIFS | |
Polyglot [9] | 2007 | HTTP, Samba, ICQ | DNS, IRC | ||
PEXT [10] | 2007 | FTP | |||
Rosetta [11] | 2007 | ||||
AutoFormat [12] | 2008 | HTTP, SIP | DHCP, RIP, OSPF | SMB, CIFS | |
Tupni [13] | 2008 | HTTP, FTP | RPC, DNS, TFTP | WMF, BMP, JPG, PNG, TIF | |
Boosting [14] | 2008 | DNS | |||
ConfigRE [15] | 2008 | ||||
ReFormat [16] | 2009 | HTTP, MIME | IRC | One unknown protocol | |
Prospex [17] | 2009 | SMTP, SIP | SMB | Agobot (C&C) | |
Xiao et al. [18] | 2009 | HTTP, FTP, SMTP | |||
Trifilo et al. [19] | 2009 | TCP, DHCP, ARP, KAD | |||
Antunes and Neves [20] | 2009 | FTP | |||
Dispatcher [21] | 2009 | HTTP, FTP, ICQ | DNS | ||
Fuzzgrind [22] | 2009 | ||||
REWARDS [23] | 2010 | ||||
MACE [24] | 2010 | ||||
Whalen et al. [25] | 2010 | ||||
AutoFuzz [26] | 2010 | ||||
ReverX [27] | 2011 | FTP | |||
Veritas [28] | 2011 | SMTP | PPLIVE, XUNLEI | ||
Biprominer [29] | 2011 | XUNLEI, QQLive, SopCast | |||
ASAP [30] | 2011 | HTTP, FTP, IRC, TFTP | |||
Howard [31] | 2011 | ||||
ProDecoder [32] | 2012 | SMTP, SIP | SMB | ||
Zhang et al. [33] | 2012 | HTTP, SNMP, ISAKMP | |||
Netzob [34] | 2012 | FTP, Samba | SMB | Unknown P2P & VoIP protocol | |
PRISMA [35] | 2012 | ||||
ARTISTE [36] | 2012 | ||||
Wang et al. [37] | 2013 | ICMP | ARP | ||
Laroche et al. [38] | 2013 | FTP | DHCP | ||
AutoReEngine [39] | 2013 | HTTP, FTP, SMTP, POP3 | DNS, NetBIOS | ||
Dispatcher2 [40] | 2013 | HTTP, FTP, ICQ | DNS | SMB | |
ProVeX [41] | 2013 | HTTP, SMTP, IMAP | DNS, VoIP, XMPP | Malware Family Protocols | |
Meng et al. [42] | 2014 | TCP, ARP | |||
AFL [43] | 2014 | ||||
Proword [44] | 2014 | ||||
ProGraph [45] | 2015 | HTTP | DNS, BitTorrent, WeChat | ||
FieldHunter [46] | 2015 | MSNP | DNS | SopCast, Ramnit | |
RS Cluster [47] | 2015 | FTP, SMTP, POP3, HTTPS | DNS, XunLei, BitTorrent, BitSpirit, QQ, eMule | MSSQL, Kugoo, PPTV | |
UPCSS [48] | 2015 | HTTP, FTP, SMTP, POP3, IMAP | DNS, SSL, SSH | SMB | |
ARGOS [49] | 2015 | ||||
PULSAR [50] | 2015 | ||||
Li et al. [51] | 2015 | ||||
Cai et al. [52] | 2016 | HTTP, SSDP | DNS, BitTorrent, QQ, NetBios | ||
WASp [53] | 2016 | IEEE 802.15.4 proprietary protocols, Smart plug & PSD systems | |||
PRE-Bin [54] | 2016 | ||||
Xiao et al. [55] | 2016 | ||||
PowerShell [56] | 2017 | ARP, OSPF, DHCP, STP | CDP/DTP/VTP, HSRP, LLDP, LLMNR, mDNS, NBNS, VRRP | ||
ProPrint [57] | 2017 | ||||
ProHacker [58] | 2017 | ||||
Esoul and Walkinshaw [59] | 2017 | ||||
PREUGI [60] | 2017 | ||||
NEMESYS [61] | 2018 | ||||
Goo et al. [62] | 2019 | HTTP | DNS | ||
Universal Radio Hacker [63] | 2019 | proprietary wireless protocols of IoT devices | |||
Luo et al. [64] | 2019 | ||||
Sun et al. [65] | 2019 | ||||
Yang et al. [66] | 2020 | IPv4, TCP | |||
Sun et al. [67] | 2020 | ||||
Shim et al. [68] | 2020 | FTP | Modbus/TCP, Ethernet/IP | ||
NEMETYL [69] | 2020 | ||||
IPART [70] | 2020 | Modbus, IEC104, Ethernet/IP | |||
GrAMeFFSI [71] | 2020 | Modbus, MQTT | |||
NetPlier [72] | 2021 | FTP | DHCP, DNP3, ICMP, Modbus, NTP, TFTP | SMB, SMB2 | ZeroAccess |
Source Code
Most papers do not provide the code used in the research. (Example) code exists for the following papers.
Name | Year | Source Code |
---|---|---|
PIP [1] | 2004 | https://web.archive.org/web/20090416234849/http://4tphi.net/~awalters/PI/PI.html |
ReverX [27] | 2011 | https://github.com/jasantunes/reverx |
Netzob [34] | 2012 | https://github.com/netzob/netzob |
PRISMA [35] | 2012 | https://github.com/tammok/PRISMA/ |
PULSAR [50] | 2015 | https://github.com/hgascon/pulsar |
NEMESYS [61] | 2018 | https://github.com/vs-uulm/nemesys |
Universal Radio Hacker [63] | 2019 | https://github.com/jopohl/urh |
NetPlier [72] | 2021 | https://github.com/netplier-tool/NetPlier |
References
M. Beddoe, “The protocol informatics project,” 2004, http://www.4tphi.net/∼awalters/PI/PI.html. PDF
N. Borisov, D. J. Brumley, H. J. Wang, J. Dunagan, P. Joshi, and C. Guo, “Generic application-level protocol analyzer and its language,” MSR Technical Report MSR-TR-2005-133, 2005. PDF
C. Leita, K. Mermoud, and M. Dacier, “ScriptGen: an automated script generation tool for Honeyd,” in Proceedings of the 21st Annual Computer Security Applications Conference (ACSAC ’05), pp. 203–214, Tucson, Ariz, USA, December 2005. PDF
W. Cui, V. Paxson, N. C. Weaver, and R. H. Katz, “Protocol-independent adaptive replay of application dialog,” in Proceedings of the 13th Symposium on Network and Distributed System Security (NDSS ’06), 2006. PDF
J. Ma, K. Levchenko, C. Kreibich, S. Savage, and G. Voelker, “Automatic protocol inference: unexpected means of identifying protocols,” UCSD Computer Science Technical Report CS2006-0850, 2006. PDF
Lim, J., Reps, T., Liblit, B.: Extracting output formats from executables. In: 13th Working Conference on Reverse Engineering, 2006. WCRE ’06, pp. 167–178. IEEE, Benevento (2006). doi:10.1109/WCRE.2006.29 PDF
Cui, W., Paxson, V., Weaver, N., Katz, R.H.: Protocol-independent adaptive replay of application dialog. In: Proceedings of the 13th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2006). http://research.microsoft.com/apps/pubs/default.aspx?id=153197
W. Cui, J. Kannan, and H. J. Wang, “Discoverer: Automatic protocol reverse engineering from network traces.,” in USENIX security symposium, 2007, pp. 1–14. PDF
J. Caballero, H. Yin, Z. Liang, and D. Song, “Polyglot: automatic extraction of protocol message format using dynamic binary analysis,” in Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS ’07), pp. 317–329, ACM, November 2007. PDF
M. Shevertalov and S. Mancoridis, “A reverse engineering tool for extracting protocols of networked applications,” in Proceedings of the 14th Working Conference on Reverse Engineering (WCRE ’07), pp. 229–238, October 2007. PDF
Caballero, J., Song, D.: Rosetta: Extracting Protocol Semantics Using Binary Analysis with Applications to Protocol Replay and NAT Rewriting. Technical Report CMU-CyLab-07-014, Carnegie Mellon University, Pittsburgh (2007)PDF
Z. Lin, X. Jiang, D. Xu, and X. Zhang, “Automatic protocol format reverse engineering through context-aware monitored execution,” in Proceedings of the 15th Symposium on Network and Distributed System Security (NDSS ’08), February 2008.PDF
W. Cui, M. Peinado, K. Chen, H. J. Wang, and L. Irun-Briz, “Tupni: automatic reverse engineering of input formats,” in Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS ’08), pp. 391–402, ACM, Alexandria, Va, USA, October 2008. PDF
K. Gopalratnam, S. Basu, J. Dunagan, and H. J. Wang, “Automatically extracting fields from unknown network protocols,” in Proceedings of the 15th Symposium on Network and Distributed System Security (NDSS ’08), 2008. PDF
Wang, R., Wang, X., Zhang, K., Li, Z.: Towards automatic reverse engineering of software security configurations. In: Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS ’08, pp. 245–256. ACM, Limerick (2008). doi:10.1145/1455770.1455802.PDF
Z. Wang, X. Jiang, W. Cui, X. Wang, and M. Grace, “ReFormat: automatic reverse engineering of encrypted messages,” in Computer Security—ESORICS 2009. ESORICS 2009, M. Backes and P. Ning, Eds., vol. 5789 of Lecture Notes in Computer Science, pp. 200–215, Springer, Berlin, Germany, 2009.PDF
P. M. Comparetti, G. Wondracek, C. Kruegel, and E. Kirda, “Prospex: protocol specification extraction,” in Proceedings of the 30th IEEE Symposium on Security and Privacy, pp. 110–125, Berkeley, Calif, USA, May 2009. PDF
M.-M. Xiao, S.-Z. Yu, and Y. Wang, “Automatic network protocol automaton extraction,” in Proceedings of the 3rd International Conference on Network and System Security (NSS ’09), pp. 336–343, October 2009.PDF
A. Trifilo, S. Burschka, and E. Biersack, “Traffic to protocol reverse engineering,” in Proceedings of the IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–8, July 2009.PDF
J. Antunes and N. Neves, “Building an automaton towards reverse protocol engineering,” 2009, PDF.
J. Caballero, P. Poosankam, C. Kreibich, and D. Song, “Dispatcher: enabling active botnet infiltration using automatic protocol reverse-engineering,” in Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS ’09), pp. 621–634, ACM, Chicago, Ill, USA, November 2009. PDF
Campana, G.: Fuzzgrind: an automatic fuzzing tool. In: Hack. lu. Hack. lu, Luxembourg (2009).PDF
Lin, Z., Zhang, X., Xu, D.: Automatic reverse engineering of data structures from binary execution. In: Proceedings of the 17th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2010).PDF
Cho, C.Y., Babić, D., Shin, E.C.R., Song, D.: Inference and analysis of formal models of botnet command and control protocols. In: Proceedings of the 17th ACM Conference on Computer and Communications Security, CCS ’10, pp. 426–439. ACM, New York, NY (2010). doi:10.1145/1866307.1866355. PDF Cho, C.Y., Babić, D., Poosankam, P., Chen, K.Z., Wu, E.X., Song, D.: MACE: model-inference-assisted concolic exploration for protocol and vulnerability discovery. In: Proceedings of the 20th USENIX Conference on Security, SEC’11, p. 19. USENIX Association, Berkeley, CA (2011). PDF
S. Whalen, M. Bishop, and J. P. Crutchfield, “Hidden Markov Models for Automated Protocol Learning,” in Security and Privacy in Communication Networks, vol. 50, S. Jajodia and J. Zhou, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 415–428. PDF
S. Gorbunov and A. Rosenbloom, “Autofuzz: Automated network protocol fuzzing framework,” IJCSNS, vol. 10, no. 8, p. 239, 2010. PDF
J. Antunes, N. Neves, and P. Verissimo, “Reverse engineering of protocols from network traces,” in Proceedings of the 18th Working Conference on Reverse Engineering (WCRE ’11), pp. 169–178, October 2011. PDF
Y. Wang, Z. Zhang, D. D. Yao, B. Qu, and L. Guo, “Inferring protocol state machine from network traces: a probabilistic approach,” in Proceedings of the 9th Applied Cryptography and Network Security International Conference (ACNS ’11), pp. 1–18, 2011.PDF
Y. Wang, X. Li, J. Meng, Y. Zhao, Z. Zhang, and L. Guo, “Biprominer: automatic mining of binary protocol features,” in Proceedings of the 12th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT ’11), pp. 179–184, October 2011.PDF
T. Krueger, N. Krämer, and K. Rieck, “Asap: automatic semantics-aware analysis of network payloads,” in Proceedings of the ECML/PKDD, 2011. PDF
Slowinska, A., Stancescu, T., Bos, H.: Howard: a dynamic excavator for reverse engineering data structures. In: Proceedings of the 18th Annual Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego (2011).PDF
Y. Wang, X. Yun, M. Z. Shafiq et al., “A semantics aware approach to automated reverse engineering unknown protocols,” in Proceedings of the 20th IEEE International Conference on Network Protocols (ICNP ’12), pp. 1–10, IEEE, Austin, Tex, USA, November 2012. PDF
Z. Zhang, Q.-Y. Wen, and W. Tang, “Mining protocol state machines by interactive grammar inference,” in Proceedings of the 2012 3rd International Conference on Digital Manufacturing and Automation (ICDMA ’12), pp. 524–527, August 2012.PDF
G. Bossert, F. Guihéry, and G. Hiet, “Towards automated protocol reverse engineering using semantic information,” in Proceedings of the 9th ACM Symposium on Information, Computer and Communications Security, Kyoto, Japan, June 2014.PDF G. Bossert and F. Guihéry, “Reverse and simulate your enemy botnet C&C,” in Proceedings of the Mapping a P2P Botnet with Netzob, Black Hat 2012, Abu Dhabi, UAE, December 2012. PDF
Krueger, T., Gascon, H., Krämer, N., Rieck, K.: Learning stateful models for network honeypots. In: Proceedings of the 5th ACM Workshop on Security and Artificial Intelligence, AISec ’12, pp. 37–48. ACM, New York, NY (2012). doi:10.1145/2381896.2381904. PDF
Caballero, J., Grieco, G., Marron, M., Lin, Z., Urbina, D.: ARTISTE: Automatic Generation of Hybrid Data Structure Signatures from Binary Code Executions. Technical Report TR-IMDEA-SW-2012-001, IMDEA Software Institute, Madrid (2012).PDF
Y. Wang, N. Zhang, Y.-M. Wu, B.-B. Su, and Y.-J. Liao, “Protocol formats reverse engineering based on association rules in wireless environment,” in Proceedings of the 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’13), pp. 134–141, Melbourne, Australia, July 2013.PDF
P. Laroche, A. Burrows, and A. N. Zincir-Heywood, “How far an evolutionary approach can go for protocol state analysis and discovery,” in Proceedings of the IEEE Congress on Evolutionary Computation (CEC ’13), pp. 3228–3235, June 2013.PDF
J.-Z. Luo and S.-Z. Yu, “Position-based automatic reverse engineering of network protocols,” Journal of Network and Computer Applications, vol. 36, no. 3, pp. 1070–1077, 2013.PDF
J. Caballero and D. Song, “Automatic protocol reverse-engineering: message format extraction and field semantics inference,” Computer Networks, vol. 57, no. 2, pp. 451–474, 2013. PDF
C. Rossow and C. J. Dietrich, “PROVEX: detecting botnets with encrypted command and control channels,” in Detection of Intrusions and Malware, and Vulnerability Assessment, Springer, 2013. PDF
F. Meng, Y. Liu, C. Zhang, T. Li, and Y. Yue, “Inferring protocol state machine for binary communication protocol,” in Proceedings of the IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA ’14), pp. 870–874, September 2014.PDF
Zalewski, M.: American Fuzzy Lop. http://lcamtuf.coredump.cx/afl/technical_details.txt
Z. Zhang, Z. Zhang, P. P. C. Lee, Y. Liu, and G. Xie, “ProWord: An unsupervised approach to protocol feature word extraction,” in IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, Toronto, ON, Canada, Apr. 2014, pp. 1393–1401, doi: 10.1109/INFOCOM.2014.6848073. PDF
Q. Huang, P. P. C. Lee, and Z. Zhang, “Exploiting intrapacket dependency for fine-grained protocol format inference,” in Proceedings of the 14th IFIP Networking Conference (NETWORKING ’15), Toulouse, France, May 2015.PDF
I. Bermudez, A. Tongaonkar, M. Iliofotou, M. Mellia, and M. M. Munafo, “Automatic protocol field inference for deeper protocol understanding,” in Proceedings of the 14th IFIP Networking Conference (Networking ’15), pp. 1–9, May 2015. PDF
J.-Z. Luo, S.-Z. Yu, and J. Cai, “Capturing uncertainty information and categorical characteristics for network payload grouping in protocol reverse engineering,” Mathematical Problems in Engineering, vol. 2015, Article ID 962974, 9 pages, 2015.PDF
R. Lin, O. Li, Q. Li, and Y. Liu, “Unknown network protocol classification method based on semi supervised learning,” in Proceedings of the IEEE International Conference on Computer and Communications (ICCC ’15), pp. 300–308, Chengdu, China, October 2015.PDF
Zeng, J., Lin, Z.: Towards automatic inference of kernel object semantics from binary code. In: 18th International Symposium, RAID 2015, vol. 9404, pp. 538–561. Springer, Kyoto (2015). doi:10.1007/978-3-319-26362-5. PDF html
H. Gascon, C. Wressnegger, F. Yamaguchi, D. Arp, and K. Rieck, “Pulsar: Stateful Black-Box Fuzzing of Proprietary Network Protocols,” in Security and Privacy in Communication Networks, vol. 164, B. Thuraisingham, X. Wang, and V. Yegneswaran, Eds. Cham: Springer International Publishing, 2015, pp. 330–347. PDF
H. Li, B. Shuai, J. Wang, and C. Tang, “Protocol Reverse Engineering Using LDA and Association Analysis,” in 2015 11th International Conference on Computational Intelligence and Security (CIS), Shenzhen, China, Dec. 2015, pp. 312–316, doi: 10.1109/CIS.2015.83.PDF
J. Cai, J. Luo, and F. Lei, “Analyzing network protocols of application layer using hidden Semi-Markov model,” Mathematical Problems in Engineering, vol. 2016, Article ID 9161723, 14 pages, 2016. PDF
K. Choi, Y. Son, J. Noh, H. Shin, J. Choi, and Y. Kim, “Dissecting customized protocols: automatic analysis for customized protocols based on IEEE 802.15.4,” in Proceedings of the 9th ACM Conference on Security and Privacy in Wireless and Mobile Networks, pp. 183–193, Darmstadt, Germany, July 2016. PDF
S. Tao, H. Yu, and Q. Li, “Bit‐oriented format extraction approach for automatic binary protocol reverse engineering,” IET Communications, vol. 10, no. 6, pp. 709–716, Apr. 2016, doi: 10.1049/iet-com.2015.0797. PDF
M.-M. Xiao, S.-L. Zhang, and Y.-P. Luo, “Automatic network protocol message format analysis,” IFS, vol. 31, no. 4, pp. 2271–2279, Sep. 2016, doi: 10.3233/JIFS-169067.PDF
D. R. Fletcher Jr., Identifying Vulnerable Network Protocols with PowerShell, SANS Institute Reading Room site, 2017.PDF
Y. Wang, X. Yun, Y. Zhang, L. Chen, and G. Wu, “A nonparametric approach to the automated protocol fingerprint inference,” Journal of Network and Computer Applications, vol. 99, pp. 1–9, 2017.html
Y. Wang, X. Yun, Y. Zhang, L. Chen, and T. Zang, “Rethinking robust and accurate application protocol identification,” Computer Networks, vol. 129, pp. 64–78, 2017.PDF
O. Esoul and N. Walkinshaw, “Using Segment-Based Alignment to Extract Packet Structures from Network Traces,” in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), Prague, Czech Republic, Jul. 2017, pp. 398–409, doi: 10.1109/QRS.2017.49. PDF
M.-M. Xiao and Y.-P. Luo, “Automatic protocol reverse engineering using grammatical inference,” IFS, vol. 32, no. 5, pp. 3585–3594, Apr. 2017, doi: 10.3233/JIFS-169294.PDF
S. Kleber, H. Kopp, and F. Kargl, “NEMESYS: Network message syntax reverse engineering by analysis of the intrinsic structure of individual messages,” 2018. PDF
Y.-H. Goo, K.-S. Shim, M.-S. Lee, and M.-S. Kim, “Protocol Specification Extraction Based on Contiguous Sequential Pattern Algorithm,” IEEE Access, vol. 7, pp. 36057–36074, 2019, doi: 10.1109/ACCESS.2019.2905353. PDF
J. Pohl and A. Noack, “Universal radio hacker: A suite for analyzing and attacking stateful wireless protocols,” Baltimore, MD, Aug. 2018, [Online]. Available: https://www.usenix.org/conference/woot18/presentation/pohl. J. Pohl and A. Noack, “Automatic wireless protocol reverse engineering,” Santa Clara, CA, Aug. 2019, [Online]. Available: https://www.usenix.org/conference/woot19/presentation/pohl. PDF
X. Luo, D. Chen, Y. Wang, and P. Xie, “A Type-Aware Approach to Message Clustering for Protocol Reverse Engineering,” Sensors, vol. 19, no. 3, p. 716, Feb. 2019, doi: 10.3390/s19030716. PDF
F. Sun, S. Wang, C. Zhang, and H. Zhang, “Unsupervised field segmentation of unknown protocol messages,” Computer Communications, vol. 146, pp. 121–130, Oct. 2019, doi: 10.1016/j.comcom.2019.06.013.html
C. Yang, C. Fu, Y. Qian, Y. Hong, G. Feng, and L. Han, “Deep Learning-Based Reverse Method of Binary Protocol,” in Security and Privacy in Digital Economy, vol. 1268, S. Yu, P. Mueller, and J. Qian, Eds. Singapore: Springer Singapore, 2020, pp. 606–624.PDF
F. Sun, S. Wang, C. Zhang, and H. Zhang, “Clustering of unknown protocol messages based on format comparison,” Computer Networks, vol. 179, p. 107296, Oct. 2020, doi: 10.1016/j.comnet.2020.107296.html
K. Shim, Y. Goo, M. Lee, and M. Kim, “Clustering method in protocol reverse engineering for industrial protocols,” International Journal of Network Management, Jun. 2020, doi: 10.1002/nem.2126. PDF
Stephan Kleber, Rens Wouter van der Heijden, Frank Kargl, “Message Type Identification of Binary Network Protocols using Continuous Segment Similarity.” PDF
X. Wang, K. Lv, and B. Li, “IPART: an automatic protocol reverse engineering tool based on global voting expert for industrial protocols,” International Journal of Parallel, Emergent and Distributed Systems, vol. 35, no. 3, pp. 376–395, May 2020, doi: 10.1080/17445760.2019.1655740. PDF
Ládi, Gergő and Buttyán, Levente and Holczer, Tamás (2020) GrAMeFFSI: Graph Analysis Based Message Format and Field Semantics Inference For Binary Protocols, Using Recorded Network Traffic. INFOCOMMUNICATIONS JOURNAL, 12 (2). pp. 25-33. ISSN 2061-2079. PDF
Yapeng Ye, Zhuo Zhang, Fei Wang, Xiangyu Zhang, Dongyan Xu (Purdue University) NetPlier: Probabilistic Network Protocol Reverse Engineering from Message Traces. PDF