Oct 2, 2023
Adrian Culley
Trellix
Postponed (holiday)
Nov 6, 2023
Achiya Elyasaf
Ben-Gurion University,
Israel
Postponed to April 2024
Dec 11, 2023
Tom Yaacov
Ben-Gurion University,
Israel
Adding Liveness to Executable Specifications
Abstract:
One of the benefits of using executable specifications such as Behavioral Programming (BP) is the ability to align the system code with its requirements. This alignment is facilitated by a protocol that allows modules, each representing a requirement, to specify what the system may, must, and must not do. However, this approach supports only the enforcement of safety properties; it does not support liveness properties, which describe desirable outcomes. To address this, we propose idioms for tagging states with "must-finish," indicating that specific tasks are yet to be completed, which allows liveness requirements to be modeled directly and independently in BP. We offer semantics and two execution mechanisms, one based on translation to a Generalized Büchi Automaton (GBA) game and the other based on deep reinforcement learning (DRL). We include a formal analysis of the proposed mechanisms and quantitative assessments using a publicly available tool we developed.
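To make the BP protocol concrete, here is a minimal, illustrative sketch of the may/must-not idiom plus a "must-finish" tag. This is not the speakers' tool or semantics (their GBA-game and DRL mechanisms are far richer); all names and the event-selection rule are invented for illustration.

```python
def bthread_hot_cold():
    # Request HOT three times; stay tagged "must-finish" until done.
    for _ in range(3):
        yield {"request": {"HOT"}, "must_finish": True}

def bthread_interleave():
    # After each HOT, require a COLD before the next HOT.
    while True:
        yield {"wait": {"HOT"}}
        yield {"request": {"COLD"}, "block": {"HOT"}}

def run(bthreads, max_steps=20):
    """Tiny BP-style scheduler: pick an event that is requested by some
    thread and blocked by none; advance every thread that requested or
    waited for it.  Reports whether any thread is still in a
    "must-finish" state when execution halts (a liveness violation)."""
    states = []
    for bt in bthreads:
        try:
            states.append((bt, bt.send(None)))  # advance to first yield
        except StopIteration:
            pass
    trace = []
    for _ in range(max_steps):
        requested = set().union(*[s.get("request", set()) for _, s in states])
        blocked = set().union(*[s.get("block", set()) for _, s in states])
        enabled = requested - blocked
        if not enabled:
            break
        event = sorted(enabled)[0]  # deterministic choice for the sketch
        trace.append(event)
        next_states = []
        for bt, s in states:
            if event in s.get("request", set()) | s.get("wait", set()):
                try:
                    next_states.append((bt, bt.send(event)))
                except StopIteration:
                    continue  # thread finished; drop it
            else:
                next_states.append((bt, s))
        states = next_states
    unfinished = any(s.get("must_finish", False) for _, s in states)
    return trace, unfinished
```

Running `run([bthread_hot_cold(), bthread_interleave()])` yields the alternating trace HOT, COLD, HOT, COLD, HOT, COLD with no thread left unfinished; if the run had halted while a thread was still tagged "must-finish," the flag would surface the liveness violation that plain safety-only BP cannot express.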
Feb 5, 2024
Diptikalyan Saha
IBM Research, India
Automated Black-box API Testing
Abstract: The proliferation of web-based applications has increased the need to test web services. With the adoption of web-service standards such as REST (Representational State Transfer) and SOAP (Simple Object Access Protocol), it has become easier for developers to build and consume APIs. The testing of REST APIs has been a topic of interest in the recent past; however, these studies mostly focus on finding bugs in the system under test. Functional testing, by contrast, tests the functional behavior of the system under test. In this work, we aim to automatically generate realistic functional test cases, which can also be used for regression. Functional testing seeks to cover valid functional scenarios, a notion we define concretely, and we present an algorithm that generates nominal/valid test cases following a functional sequence of operations. We use a resource-based grouping strategy, a novel producer-consumer dependency inference algorithm, and a language-model-based sequencing algorithm to generate an operation sequence suitable for functional test cases.
In this talk, I will present a bird's-eye view of several AI-testing efforts at IBM Research. I will then introduce the problem of black-box API testing, discuss some of the pioneering work in the area, and present our tool, the Autonomous API Tester.
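The producer-consumer dependency inference the abstract mentions can be illustrated with a toy example: if one operation's response produces a field that another operation's parameters consume, the producer must run first in the test sequence. The mini-spec and matching rule below are invented for illustration and are not the Autonomous API Tester's actual algorithm.

```python
# Hypothetical mini-spec: each operation lists the response fields it
# produces and the parameters it consumes (all names are invented).
OPERATIONS = {
    "POST /users":            {"produces": {"user_id"},  "consumes": set()},
    "GET /users/{user_id}":   {"produces": set(),        "consumes": {"user_id"}},
    "POST /orders":           {"produces": {"order_id"}, "consumes": {"user_id"}},
    "GET /orders/{order_id}": {"produces": set(),        "consumes": {"order_id"}},
}

def infer_sequence(ops):
    """Order operations so that every consumed field is produced by an
    earlier operation -- a naive producer-consumer dependency inference."""
    produced, sequence, remaining = set(), [], dict(ops)
    while remaining:
        # Operations whose inputs are all already available.
        ready = sorted(name for name, op in remaining.items()
                       if op["consumes"] <= produced)
        if not ready:
            raise ValueError(f"unsatisfiable dependencies: {sorted(remaining)}")
        op = ready[0]  # deterministic tie-break for the sketch
        sequence.append(op)
        produced |= remaining.pop(op)["produces"]
    return sequence
```

On the mini-spec above, `infer_sequence(OPERATIONS)` places `POST /users` first, since only it consumes nothing, and defers `GET /orders/{order_id}` until `POST /orders` has produced `order_id` — the kind of functional operation sequence a nominal test case would then exercise with concrete values.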
April 1, 2024
Achiya Elyasaf
Ben-Gurion University,
Israel
Automatic Code Generation
Abstract:
In this talk, I will present three different works on automatic code generation.
"Evolving Assembly Code in an Adversarial Environment" (https://arxiv.org/abs/2403.19489):
The study evolves assembly code for the CodeGuru competition, aiming to create a survivor program that runs longest in shared memory. Evolved programs discover weaknesses in opponent programs and exploit them.
"From Requirements to Source Code: Evolution of Behavioral Programs" (https://doi.org/10.3390/app12031587):
Genetic programming (GP) combined with behavioral programming (BP) is explored. BP models programs as sets of behavioral threads aligned with system requirements, facilitating effective program generation using GP.
"From Requirements to Source Code for Real: " (ongoing work):
Large language models (LLMs) have vastly improved automated code generation, though they still struggle with complex behavior, especially behavior involving cross-cutting aspects. Bridging the gap between system requirements and code requires an intermediate design phase. This work advocates using LLMs to translate requirements into behavioral programming (BP) specifications. This approach facilitates direct alignment between requirements and BP modules, enabling the generation of cohesive behavior that is aligned with all requirements and eliminating the need for a separate design phase.
May 6, 2024
IBM Research, Haifa
A technical concept map of the foundations of AI
Postponed
June 3, 2024
Michael Brand
Otzma Analytics
Avoiding the Pitfalls of Data Science: Lessons from Three Reviews
Recording
July 1, 2024
Eitan Farchi
IBM Research, Haifa
Sept 9, 2024
Michael Brand
Otzma Analytics
A Theory of Error Intolerant Estimation
Abstract:
Point estimation is a fundamental statistical task. Given the wide selection of available point estimators, however, it is unclear what, if any, universally agreed theoretical reasons there are to generally prefer one such estimator over another. In this talk, we define a class of estimation scenarios that includes commonly encountered problem situations, such as "high-stakes" estimation and scientific inference, and introduce a new class of estimators, Error Intolerance Candidates (EIC) estimators, which we prove is optimal for it.
EIC estimators are parameterised by an externally given loss function. We prove, however, that even without such a loss function, if one accepts a small number of incontrovertible-seeming assumptions regarding what constitutes a reasonable loss function, the optimal EIC estimator can be characterised uniquely.
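As textbook background for the loss-function parameterisation the abstract refers to (this is the standard loss-based setup, not the EIC construction the talk defines):

```latex
% Standard loss-based point estimation: given data x, a loss L(\theta, a)
% over parameter \theta and action a, and a posterior over \theta,
% the loss-optimal point estimate is
\hat{\theta}(x) = \operatorname*{arg\,min}_{a} \; \mathbb{E}\bigl[ L(\theta, a) \mid X = x \bigr].
```

EIC estimators are likewise parameterised by a loss L; the talk's contribution is the claim that, under a few assumptions about what makes a loss reasonable, the optimal EIC estimator is pinned down uniquely even when no L is supplied externally.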
January 6, 2025
Eitan Farchi
IBM Research, Haifa
Automatic Generation of Benchmarks and Reliable LLM Judgment for Code Tasks