{"id":1458,"date":"2024-07-29T12:22:06","date_gmt":"2024-07-29T10:22:06","guid":{"rendered":"https:\/\/ai4cyber.eu\/?p=1458"},"modified":"2024-07-29T12:22:06","modified_gmt":"2024-07-29T10:22:06","slug":"hunting-down-vulnerabilities-with-ai-enhanced-symbolic-execution","status":"publish","type":"post","link":"https:\/\/ai4cyber.eu\/?p=1458","title":{"rendered":"Hunting down vulnerabilities with AI-enhanced Symbolic Execution"},"content":{"rendered":"<p><em><strong><span class=\"TextRun SCXW73065283 BCX8\" lang=\"EN-GB\" xml:lang=\"EN-GB\" data-contrast=\"auto\"><span class=\"NormalTextRun SCXW73065283 BCX8\">Author:<\/span> <\/span><span class=\"TextRun SCXW73065283 BCX8\" lang=\"EN-GB\" xml:lang=\"EN-GB\" data-contrast=\"auto\"><span class=\"NormalTextRun SCXW73065283 BCX8\">Edit Peng\u0151, Istv\u00e1n Siket<\/span><\/span><span class=\"EOP SCXW73065283 BCX8\" data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:260}\">\u00a0<\/span><\/strong><\/em><\/p>\n<p><span data-contrast=\"auto\">In the ever-evolving world of software development, ensuring that your code is robust, secure, and free from bugs is paramount. Traditional testing methods, like unit tests and integration tests, are essential but have limitations. Enter symbolic execution \u2013 a powerful technique that can take your software testing to the next level. In the following post, we\u2019ll break down symbolic execution into easy-to-understand concepts, exploring what it is, and how it can be enhanced with AI technologies.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><em><strong>What is Symbolic Execution?\u00a0<\/strong><\/em><\/p>\n<p><span data-contrast=\"auto\">Imagine you\u2019re testing a piece of software. Normally, you\u2019d run it with specific inputs to see how it behaves. Symbolic execution, however, takes a different approach. Instead of using actual inputs, it uses symbols to represent a wide range of possible inputs. This allows it to explore multiple execution paths of the program in one go, rather than one path at a time.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><em><strong>How Does Symbolic Execution Work?\u00a0<\/strong><\/em><\/p>\n<p><span data-contrast=\"auto\">The symbolic execution engine runs the program, substituting the symbolic inputs wherever the actual inputs would normally be used. As the program executes, it explores different paths based on the conditions it encounters. This is called <\/span><b><span data-contrast=\"auto\">path exploration<\/span><\/b><span data-contrast=\"auto\">. For each path the program takes, the symbolic execution engine records the conditions (constraints) that must be true for that path to be taken. For instance, if the program has an <\/span><span data-contrast=\"auto\">if<\/span><span data-contrast=\"auto\"> statement, it will record the condition under which the <\/span><span data-contrast=\"auto\">if<\/span><span data-contrast=\"auto\"> branch is taken and the condition under which the else branch is taken. These constraints form the so-called <\/span><b><span data-contrast=\"auto\">path condition<\/span><\/b><span data-contrast=\"auto\">. Symbolic engines use a constraint solver to determine if there are actual inputs that satisfy the conditions for each path. This helps identify which paths are feasible and which ones are not.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><em><strong>Key benefits of Symbolic Execution\u00a0<\/strong><\/em><\/p>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"21\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Comprehensive Coverage<\/span><\/b><span data-contrast=\"auto\">: Traditional testing often misses edge cases and rarely exercises paths in the code. Symbolic execution systematically explores these paths, ensuring a more thorough testing.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"21\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Bug Discovery<\/span><\/b><span data-contrast=\"auto\">: By exploring a wide range of inputs and paths, symbolic execution can uncover bugs that might be missed by other testing methods. This includes finding security vulnerabilities, such as buffer overflows and integer overflows.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"21\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"3\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Automated Test Generation<\/span><\/b><span data-contrast=\"auto\">: Generating test cases manually can be time-consuming and error-prone. Symbolic execution automates this process, producing test cases that cover a wide range of scenarios.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<\/ul>\n<p><em><strong>SourceMeter<\/strong><\/em><\/p>\n<p><span data-contrast=\"auto\">SourceMeter is a static analyzer toolchain developed by FrontEndART Software Ltd. It analyzes C\/C++, Java, C#, Python, and JavaScript projects. It calculates source code metrics, detects code clones, and finds coding rule violations in the source code. One of its components is a Symbolic Execution Engine for Java. This executor is designed to detect runtime exceptions in Java source code without executing the application in a real-life environment. Currently, it can detect four kinds of common failures:<\/span> <span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"19\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><span data-contrast=\"auto\">NullPointerException<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"19\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">ArrayIndexOutOfBoundsException<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"19\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"3\" data-aria-level=\"1\"><span data-contrast=\"auto\">NegativeArraySizeException<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"19\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"4\" data-aria-level=\"1\"><span data-contrast=\"auto\">DivideByZeroException<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<\/ul>\n<p><span data-contrast=\"auto\">The symbolic execution is called for each method in the program separately. For big systems, this approach is usually a better solution than only starting the execution from the main() method.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><em><strong>Challenges and Limitations\u00a0<\/strong><\/em><\/p>\n<p><span data-contrast=\"auto\">While symbolic execution is powerful, it\u2019s not without challenges. The number of possible paths grows exponentially with the number of branches (<\/span><b><span data-contrast=\"auto\">path explosion<\/span><\/b><span data-contrast=\"auto\">), making it difficult to explore all paths. Moreover, solving complex constraints can be computationally expensive and time-consuming. It is not surprising that symbolic execution suffers from <\/span><b><span data-contrast=\"auto\">scalability issues<\/span><\/b><span data-contrast=\"auto\"> on larger systems. For this reason, there are numerous heuristics and methods to improve the scalability of symbolic execution. In the following section , we discuss the possibilities of combining AI and symbolic execution.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><em><strong>AI-enhanced Symbolic Execution\u00a0<\/strong><\/em><\/p>\n<p><span data-contrast=\"auto\">By leveraging AI&#8217;s predictive capabilities, learning algorithms, and optimization techniques, we can overcome the aforementioned challenges. Potential uses:<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"20\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Reinforcement Learning (RL)<\/span><\/b><span data-contrast=\"auto\">: Reinforcement learning can be used to guide the path exploration process. By learning which paths are more likely to lead to bugs or critical code sections, an RL agent can prioritize these paths, making the symbolic execution process more efficient.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"20\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Optimizing Constraint Solving<\/span><\/b><span data-contrast=\"auto\">: Machine learning models can predict the complexity of constraints and choose the most appropriate solving strategies. For instance, a model can decide whether to use a lightweight heuristic-based solver or a more powerful but resource-intensive solver based on the characteristics of the constraints.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:120,&quot;335559739&quot;:120,&quot;335559740&quot;:259}\">\u00a0<\/span><\/li>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"20\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Neural Networks<\/span><\/b><span data-contrast=\"auto\">: Neural networks can learn from previous symbolic execution runs to predict which execution paths are likely to reveal new bugs. This can optimize the path exploration process.<\/span><\/li>\n<\/ul>\n<p><em><strong>Conclusion<\/strong><\/em><\/p>\n<p>We have experimented in the past to improve the path selection in SourceMeter&#8217;s symbolic executor. We preferred paths that had more null value checks, as we assumed that these paths would have more operations that could result in a NullPointerException. We want to further improve the path selection algorithm using AI.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Edit Peng\u0151, Istv\u00e1n Siket\u00a0 In the ever-evolving world of software development, ensuring that your code is robust, secure, and free from bugs is paramount. Traditional testing methods, like unit tests and integration tests, are essential but have limitations. Enter symbolic execution \u2013 a powerful technique that can take your software testing to the next [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[11],"tags":[],"class_list":["post-1458","post","type-post","status-publish","format-standard","hentry","category-blog"],"_links":{"self":[{"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=\/wp\/v2\/posts\/1458","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1458"}],"version-history":[{"count":1,"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=\/wp\/v2\/posts\/1458\/revisions"}],"predecessor-version":[{"id":1459,"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=\/wp\/v2\/posts\/1458\/revisions\/1459"}],"wp:attachment":[{"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1458"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1458"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ai4cyber.eu\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1458"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}