Oct 2, 2023
Adrian Culley
Trellix
Postponed (holiday)
Nov 6, 2023
Achiya Elyasaf
Ben-Gurion University,
Israel
Postponed to April 2024
Dec 11, 2023
Tom Yaacov
Ben-Gurion University,
Israel
Adding Liveness to Executable Specifications
Abstract:
One of the benefits of using executable specifications such as Behavioral Programming (BP) is the ability to align the system code with its requirements. This alignment is facilitated by a protocol that allows modules, each representing a requirement, to specify what the system may, must, and must not do. However, this approach supports only the enforcement of safety properties; it does not support liveness properties, which describe desirable outcomes. To address this, we propose idioms for tagging states with "must-finish," indicating that specific tasks are yet to be completed, which allows liveness requirements to be modeled directly and independently in BP. We offer semantics and two execution mechanisms, one based on translation to a Generalized Büchi Automaton (GBA) game and the other based on deep reinforcement learning (DRL). We include a formal analysis of the proposed mechanisms and quantitative assessments using a publicly available tool we developed.
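To make the BP protocol concrete, here is a minimal, illustrative sketch of the may/must-not idiom plus a "must-finish" tag. This is not the speakers' tool or semantics (their GBA-game and DRL mechanisms are far richer); all names and the event-selection rule are invented for illustration.

```python
def bthread_hot_cold():
    # Request HOT three times; stay tagged "must-finish" until done.
    for _ in range(3):
        yield {"request": {"HOT"}, "must_finish": True}

def bthread_interleave():
    # After each HOT, require a COLD before the next HOT.
    while True:
        yield {"wait": {"HOT"}}
        yield {"request": {"COLD"}, "block": {"HOT"}}

def run(bthreads, max_steps=20):
    """Tiny BP-style scheduler: pick an event that is requested by some
    thread and blocked by none; advance every thread that requested or
    waited for it.  Reports whether any thread is still in a
    "must-finish" state when execution halts (a liveness violation)."""
    states = []
    for bt in bthreads:
        try:
            states.append((bt, bt.send(None)))  # advance to first yield
        except StopIteration:
            pass
    trace = []
    for _ in range(max_steps):
        requested = set().union(*[s.get("request", set()) for _, s in states])
        blocked = set().union(*[s.get("block", set()) for _, s in states])
        enabled = requested - blocked
        if not enabled:
            break
        event = sorted(enabled)[0]  # deterministic choice for the sketch
        trace.append(event)
        next_states = []
        for bt, s in states:
            if event in s.get("request", set()) | s.get("wait", set()):
                try:
                    next_states.append((bt, bt.send(event)))
                except StopIteration:
                    continue  # thread finished; drop it
            else:
                next_states.append((bt, s))
        states = next_states
    unfinished = any(s.get("must_finish", False) for _, s in states)
    return trace, unfinished
```

Running `run([bthread_hot_cold(), bthread_interleave()])` yields the alternating trace HOT, COLD, HOT, COLD, HOT, COLD with no thread left unfinished; if the run had halted while a thread was still tagged "must-finish," the flag would surface the liveness violation that plain safety-only BP cannot express.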
Feb 5, 2024
Diptikalyan Saha
IBM Research, India
Automated Black-box API Testing
Abstract: The proliferation of web-based applications has increased the need to test web services. With the adoption of web-service standards such as REST (Representational State Transfer) and SOAP (Simple Object Access Protocol), it has become easier for developers to build and consume APIs. The testing of REST APIs has been a topic of interest in the recent past; however, these studies mostly focus on finding bugs in the system under test. Functional testing, by contrast, tests the functional behavior of the system under test. In this work, we aim to automatically generate realistic functional test cases, which can also be used for regression. Functional testing seeks to cover valid functional scenarios, a notion we define concretely, and we present an algorithm that generates nominal/valid test cases following a functional sequence of operations. We use a resource-based grouping strategy, a novel producer-consumer dependency inference algorithm, and a language-model-based sequencing algorithm to generate an operation sequence suitable for functional test cases.
In this talk, I will present a bird's-eye view of several AI-testing efforts at IBM Research. I will then introduce the problem of black-box API testing, discuss some of the pioneering work in the area, and present our tool, the Autonomous API Tester.
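The producer-consumer dependency inference the abstract mentions can be illustrated with a toy example: if one operation's response produces a field that another operation's parameters consume, the producer must run first in the test sequence. The mini-spec and matching rule below are invented for illustration and are not the Autonomous API Tester's actual algorithm.

```python
# Hypothetical mini-spec: each operation lists the response fields it
# produces and the parameters it consumes (all names are invented).
OPERATIONS = {
    "POST /users":            {"produces": {"user_id"},  "consumes": set()},
    "GET /users/{user_id}":   {"produces": set(),        "consumes": {"user_id"}},
    "POST /orders":           {"produces": {"order_id"}, "consumes": {"user_id"}},
    "GET /orders/{order_id}": {"produces": set(),        "consumes": {"order_id"}},
}

def infer_sequence(ops):
    """Order operations so that every consumed field is produced by an
    earlier operation -- a naive producer-consumer dependency inference."""
    produced, sequence, remaining = set(), [], dict(ops)
    while remaining:
        # Operations whose inputs are all already available.
        ready = sorted(name for name, op in remaining.items()
                       if op["consumes"] <= produced)
        if not ready:
            raise ValueError(f"unsatisfiable dependencies: {sorted(remaining)}")
        op = ready[0]  # deterministic tie-break for the sketch
        sequence.append(op)
        produced |= remaining.pop(op)["produces"]
    return sequence
```

On the mini-spec above, `infer_sequence(OPERATIONS)` places `POST /users` first, since only it consumes nothing, and defers `GET /orders/{order_id}` until `POST /orders` has produced `order_id` — the kind of functional operation sequence a nominal test case would then exercise with concrete values.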
April 1, 2024
Achiya Elyasaf
Ben-Gurion University,
Israel
Automatic Code Generation
Abstract:
In this talk, I will present three different works on automatic code generation.
"Evolving Assembly Code in an Adversarial Environment" (https://arxiv.org/abs/2403.19489):
The study evolves assembly code for the CodeGuru competition, aiming to create a survivor program that runs longest in shared memory. Evolved programs discover weaknesses in opponent programs and exploit them.
"From Requirements to Source Code: Evolution of Behavioral Programs" (https://doi.org/10.3390/app12031587):
Genetic programming (GP) combined with behavioral programming (BP) is explored. BP models programs as sets of behavioral threads aligned with system requirements, facilitating effective program generation using GP.
"From Requirements to Source Code for Real: " (ongoing work):
Large language models (LLMs) have vastly improved automated code generation, though they still struggle with complex behavior, especially behavior involving cross-cutting aspects. Bridging the gap between system requirements and code requires an intermediate design phase. This work advocates using LLMs to translate requirements into behavioral programming (BP) specifications. This approach facilitates direct alignment between requirements and BP modules, enabling the generation of cohesive behavior that is aligned with all requirements and eliminating the need for a separate design phase.
May 6, 2024
IBM Research, Haifa
A technical concept map of the foundations of AI
Postponed
June 3, 2024
Michael Brand
Otzma Analytics
Avoiding the Pitfalls of Data Science: Lessons from Three Reviews
Recording
July 1, 2024
Eitan Farchi
IBM Research, Haifa
Sept 9, 2024
Michael Brand
Otzma Analytics
A Theory of Error Intolerant Estimation
Abstract:
Point estimation is a fundamental statistical task. Given the wide selection of available point estimators, however, it is unclear what, if any, universally agreed theoretical reasons there are to generally prefer one such estimator over another. In this talk, we define a class of estimation scenarios that includes commonly encountered problem situations, such as "high-stakes" estimation and scientific inference, and introduce a new class of estimators, Error Intolerance Candidates (EIC) estimators, which we prove is optimal for it.
EIC estimators are parameterised by an externally given loss function. We prove, however, that even without such a loss function, if one accepts a small number of incontrovertible-seeming assumptions regarding what constitutes a reasonable loss function, the optimal EIC estimator can be characterised uniquely.
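As textbook background for the loss-function parameterisation the abstract refers to (this is the standard loss-based setup, not the EIC construction the talk defines):

```latex
% Standard loss-based point estimation: given data x, a loss L(\theta, a)
% over parameter \theta and action a, and a posterior over \theta,
% the loss-optimal point estimate is
\hat{\theta}(x) = \operatorname*{arg\,min}_{a} \; \mathbb{E}\bigl[ L(\theta, a) \mid X = x \bigr].
```

EIC estimators are likewise parameterised by a loss L; the talk's contribution is the claim that, under a few assumptions about what makes a loss reasonable, the optimal EIC estimator is pinned down uniquely even when no L is supplied externally.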
January 6, 2025
Eitan Farchi
IBM Research, Haifa
Automatic Generation of Benchmarks and Reliable LLM Judgment for Code Tasks