Post date: Apr 22, 2021 1:38:02 PM
Leszek is a professor of computer science at the University of Liverpool: https://www.liverpool.ac.uk/computer-science/staff/leszek-gasieniec/.
See also https://sites.google.com/site/davmarkup/say/leszekgasieniec for other, related discussion.
Sent: Thursday, April 22, 2021 6:37 am
Hi Nick,
As indicated in our conversation on LinkedIn, we are a group of researchers (mostly in Algorithms) in the Department of Computer Science who would like to do more research in the broadly understood area of Humane Algorithms. We are open to any ideas that would improve our understanding of the motivation, main themes, past work and challenges.
While searching the web I encountered your workshop on the topic. We would appreciate access to the relevant material. Also, we have our group meetings on Fridays at 2pm. Would you or one of your collaborators be willing to give a short presentation in this or a related area, followed by discussion? I look forward to hearing from you.
Best wishes, Leszek
Leszek,
Thanks for your email.
I would be happy to come talk at one of your group meetings. Tomorrow is probably a bit short notice, as I would need some time to prepare.
Would next week be okay?
Regards
Nick
Hi Nick, next week would be perfect. We use the MS Teams platform, but if you prefer some other system we will adjust accordingly. Cheers, Leszek
Teams will be fine,
Nick
I noticed that you cc'd Scott Ferson in an earlier exchange of messages.
Do you think he (you can also bring more guests if you want) can join us on Friday?
Best wishes, Leszek
Yes, I suspect that Scott, as well as a couple of others from the risk institute, will probably also want to attend, if that is no problem.
Hi Nick, absolutely, the bigger the crowd the better.
One of my colleagues will give a summary of our current thoughts and plans too.
I assume we will also have time for questions and discussion (sadly no tea/coffee and cookies this time).
I am looking forward to the event on Friday.
Best wishes, Leszek
No cookies! I’ve lived too long.
I look forward to meeting you and your team tomorrow, although I’ll have to run away by 3:30 or so.
Some thoughts on this general topic are in amber at https://sites.google.com/site/humanealgorithms/. At the time, we were thinking that there are three different issues that seem related:
Fairness & social justice (including privacy, accountability, and self-driving cars that don’t kill people of colour or pregnant women),
Friendliness & workability (Bill Kahan, recovering from errors, standards for less obnoxious interfaces), and
Uncertainty & risk awareness (to handle input imprecision, missing values, appreciation of tail risks, etc.).
Best regards,
Scott
Hi Scott et al,
many thanks for the confirmation and further information.
I suppose we will likely have an introductory meeting today.
Some of my colleagues have to leave around 3pm so likely
we will need another time to discuss the concept of, challenges in,
and possible solutions related to Humane Algorithms in greater depth,
but perhaps also other topics of our joint interest.
I look forward to meeting you and others at 2pm.
Best, Leszek
PS Just in case, I am resending the link to the MS Teams meeting
[As Nick could not share his screen on Teams, Alex Wimbush sent the link for a (more humane) Zoom meeting]
https://liverpool-ac-uk.zoom.us/j/93410098227?pwd=eW1HNVNSSDFQWnBYZTkrTTM5c1Rmdz09
A video recording of the meeting was made.
Sebastian Wild asked whether there are any examples of good software, or examples where the algorithm really gets it right.
Bill Kahan jokes that any engineering disaster that gets blamed on inadequately tested software always makes him ask where the adequately tested software is.
Maybe the whole of the computer revolution is the example. Insofar as computers have revolutionised everyone's life for the better, that software is the positive example.
Leszek Gąsieniec asked for the broad categories of these deficiencies of software.
Nick argues that the failures are very context-specific, and sometimes one would have a hard time thinking of every single way algorithms might fail. But that doesn't mean that there are not groupings into which the failures Nick lists will fall, or that we don't know the broad concerns we have and the variables we should be monitoring.
We should concern ourselves when algorithms fail, when algorithms confuse, and when algorithms annoy. And we broadly know what features we should be checking: flexibility and adaptability, self-awareness (to check basal assumptions, unit concordance, recognise use outside limits and misuse), protectiveness of human privacy, attention to data that can document (un)fairness and (in)equitability, etc.
Nick Gray suggested that it would be helpful for users to have access to verbose characterisations of results that would improve the interpretability and transparency of algorithm outputs. Alex Wimbush pointed out that you want the option of verbosity, but you don't want to get a lecture with every number displayed.
Leszek Gąsieniec wondered if there is guidance on where to start. Designers often try to create optimal designs, but maybe that is not a good idea. Where do you stop? Is it always worth spending a lot of computational effort to get the absolute best answer? How long are you willing to wait to get a better answer? How much are you willing to pay? Maybe we should be exploring improvable results that might initially be good enough but can be improved with further calculation, perhaps at a premium price. When are you overoptimised?
If we want to protect endangered owls, the best way to make more owls is in an owl farm, but that sort of misses the point of protecting the natural environment, which we want to protect in its entirety.
The problem of misidentified goals is profound. If we want to minimise human deaths, for instance, the best thing to do is kill everyone now, because the more people we allow to be born, the more will eventually die. You have to be very careful what you wish for.
Engineers have an answer to the question of where you should stop. You're not really interested in the optimal answer; you're interested in the answer that is robustly optimal.
Risk analysts also have an answer. The answer is best when the uncertainty is matched. That is, your answer should not be more precise than can be justified given the uncertainties of the inputs on which it is based.
Prudence Wong pointed out that the algorithm would be called on to provide not just the answer, but a whole spectrum of answers.
On 16/05/2021 19:10, Ferson, Scott wrote:
Sebastian:
I meant to email you about our discussion on 7 May, which I found enjoyable.
First, I wanted to make the joke that that’s not how you’re supposed to use a green screen…. But you’ve probably heard that before. :)
I’m not sure we mentioned our collaborative website on humane algorithms. We’d be more than interested to chat about any of these issues.
I feel I gave your question about privacy short shrift during the discussion. The main issue is that, for algorithms to be humane, such protections need to be deployed, not just theoretical. Like security in general, no workable strategy is perfect, but something is better than nothing.
I’ve been trying to convince my friends in hospital management that the breaches we’ve seen so far are just the result of clumsiness. We’ve not yet experienced major ruthless attacks, which I am sure are eventually coming down the pike. I imagine you’ll agree that the GDPR and the safe harbour rules in the US are both too severe and insufficiently protective.
From what I understand of differential privacy, it seems fair to say that it may not be a complete solution for the issue. This is not my main area, but we worked in privacy a few years ago. Our approach deliberately avoided altering the data by introducing randomness. Our goal was to ensure privacy but not at the expense of the information present in the data. We explored generalisation as an anonymisation strategy. I put some text and slides on the Humane Algorithms website at https://sites.google.com/site/humanealgorithms/say/privacy.
Sorry for the delay in sending this email. You’ve probably already forgotten you even brought up the topic. The end of the semester was very distracting this week.
Best regards,
Scott
From: Ferson, Scott
Sent: 11 June 2021 3:21 AM
To: Wild, Sebastian <Sebastian.Wild@liverpool.ac.uk>
Cc: Gray, Nicholas [nickgray] <Nicholas.Gray@liverpool.ac.uk>
Subject: RE: projects in humane algorithms
Dear Sebastian:
Sorry it’s taken me so long to respond. End of semester, meetings get piled up.
video conferences are not always the most effective medium for that ...
They’re horrible, aren’t they? The only thing worse is listening to a live talk on Zoom; you can’t even pause or rewind it. To be fair, I sometimes want to fast-forward people I am talking to in person.
Is there a way to see the slides at a larger size?
Sure, the slides are attached at the bottom of the page as a PPTX file that you should be able to download. If you can’t read PPTX, let me know what format you’d prefer.
I would like to understand the notion of generalization.
Real values replaced by enclosing intervals. Integers replaced by sets of integers. We were doing research in statistical analysis of data that include intervals and sets, as opposed to just point values, and that’s why we got into that project. I googled “problems with differential privacy”.
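To illustrate generalisation concretely (with entirely made-up data and hypothetical function names, not anything from the actual project), replacing exact values with coarser enclosing sets might look like this:

```python
# Toy sketch of generalisation-based anonymisation: each exact value is
# replaced by a coarser enclosing interval or truncated code, so records
# become less identifiable without injecting any randomness -- the data
# stay truthful, just less precise.

def generalise_age(age, width=10):
    """Replace an exact age with its enclosing decade interval."""
    lo = (age // width) * width
    return (lo, lo + width - 1)

def generalise_postcode(postcode, keep=3):
    """Keep only the leading district of a postcode."""
    return postcode[:keep] + "*" * (len(postcode) - keep)

record = {"age": 37, "postcode": "L697ZX"}
anonymised = {"age": generalise_age(record["age"]),                 # (30, 39)
              "postcode": generalise_postcode(record["postcode"])}  # 'L69***'
print(anonymised)
```

The point of the sketch is that no noise is added; the anonymised record still truthfully encloses the original one.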
The application clearly sounds related to the contexts in which differential privacy has found uses. In our discussion, I mostly mentioned it because it naturally contains and entails a quantification of privacy. Wikipedia links few deployment examples (https://en.wikipedia.org/wiki/Differential_privacy#Adoption_of_differential_privacy_in_real-world_applications); I've only heard about these in talks. I should add that I have only a casual understanding of differential privacy, and in particular how complete this area is. My impression is that these are still quite insular ideas, and they target specific scenarios where data sharing is wanted and deliberate, but should respect privacy constraints.
Yes, I think you’re right. It’s not a unified field at all. You can quantify privacy in certain senses, but it is remarkably hard to quantify a risk (probability) of re-identification.
A general observation from our discussions so far that I wanted to share is the difference in focus of our groups. As theoreticians, we seek the minimal or most general scenario in which an interesting problem arises, and try to find solutions that are as generally applicable as possible. For provable guarantees, this might be a must. My impression of the risk institute group is closer to an engineering mindset, where the focus is on contextualizing a problem and finding holistic solutions, potentially tailored to the specifics of the task. For robust, short-term deployable solutions, this might be a must.
Clearly this is a challenge for finding common ground, but it makes a collaboration potentially very powerful, when either group can take inspiration from the other group's perspectives and skills.
If you ask me, the next step towards that goal would be to pick some concrete, small, ideally approachable example of research work in either group and present that (interactively) to the other group in considerable detail.
Yeah, of course I agree completely. I’m not an engineer myself, and my experience with them is also that they often bite off more than they can chew and then settle for half-answers, which can be pretty unsatisfying intellectually. However, finding that nice, tractable bit to focus on is something of an art, isn’t it? Sometimes I don’t know what it is until I stumble onto the solution.
Originally, I was a population biologist. Within that group, I was the mathematician/statistician. Always an outsider.
Oh! Maybe I have a suggestion for a more bite-sized CS problem: a symbolic algebra simplifier to reduce the repetitions of uncertain variables in mathematical expressions. It is a problem in computer algebra and in the design of optimising compilers. Although it may not seem terribly glamorous, it absolutely could be a game changer for getting computers to do uncertainty analysis automatically, which would be enormously beneficial to society and to individual humans. This is really a subproblem in the Puffin project, but one that’s been on the back burner for a while. It has plagued me for several years, and I even wrote some code to parse expressions and pattern-match through them with “twiddling and dislocating” (see the attached “eccad janos” poster from an old computer algebra conference). But the problem was beyond my skills, or at least my attention span. I employed some undergraduate and post-doctoral computer scientists to work on this problem without much progress, but I think it should be fairly easy for a good student, or a lecturer who can spare the time to think about it. I’ve always felt we just didn’t get the right people to work on this problem. It would bring a host of important messy problems into a realm where computers could solve them straightforwardly.
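For readers unfamiliar with why repeated uncertain variables matter, here is a minimal toy illustration (not the Puffin code itself) of the inflation that naive interval arithmetic produces, which a symbolic simplifier would remove by rewriting before evaluating:

```python
# Minimal illustration of the "repeated variable" (dependency) problem in
# interval arithmetic that the proposed simplifier would address.

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __sub__(self, other):
        # Naive rule: treats the operands as independent, even when they
        # are the same variable -- this is the source of the inflation.
        return Interval(self.lo - other.hi, self.hi - other.lo)
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

x = Interval(2, 5)
print(x - x)   # naive: [-3, 3], though x - x is identically 0
# A symbolic simplifier would rewrite x - x to 0 *before* evaluating,
# recovering the exact answer [0, 0].
```

The same widening afflicts any expression where a variable appears more than once, such as x*(1-x), which is why reducing repetitions algebraically can tighten results dramatically.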
I finally wanted to share an article with you that I wish I had brought to the discussion last time. It is just about a piece of terminology, but one that I have been struggling with for a while now, and now have learned explicit terms for: the use of “algorithm” for both a mathematical recipe for computation and any obscure decision-making system, often with negative connotation.
So here is an attempt to clarify and separate the two notions of algorithms, and a pitch for why we should care to keep them separate:
https://cacm.acm.org/blogs/blog-cacm/251514-misnomer-and-malgorithm/fulltext
(It is a bit more verbose than necessary, but I otherwise like the article.)
Thanks for the link. Our original purpose in creating the Google Site webpage to develop testimony for Parliament was in part to rescue the word ‘algorithm’ which Robin K. Hill mentions is trending, and anyone can see is poised to be the patsy in these early days of the robot wars [insert emojis: smiley face; sorry not sorry]. I certainly agree with Hill that blaming the algorithm would be missing the point. On this topic, I’ve been surprised that no one has yet mentioned that ‘algorithm’ is an Arabic word. I think Numberphile says that some Christians during the Crusades refused to use zero because it came from the infidels.
But tech is not neutral as Hill argues; it embodies the non-neutrality of the coders, and the code designers, and the assemblers of the data sets that formed the quantitative information wielded inside the tech, and the non-neutrality of the society that created the context for the tech in the first place. That’s surely what our recent experiences with Black-run businesses and on-line reviews have taught us, isn’t it? I didn’t read her previous blog, but surely coders can and do invest human decision making into algorithms, and that can go well just as it can go badly.
It’s great that you find Hill’s post useful. Obviously it’s an important discussion, and it needs to percolate across the professional community. Hill distinguishes the i-algorithm and the j-algorithm, but this may be pointless, as we already distinguish an algorithm from its application context. It seems absurd to think an i-algorithm cannot be biased by definition, or in Hill’s case by wishful thinking. Anything can be misused. Likewise, one could obviously build the devil into the code itself. The Republicans in the United States used purpose-built gerrymandering algorithms to dilute likely Democratic voters. And their recently proposed voting laws intend to disenfranchise people of colour. There is nothing “objective” or “beautiful” about that algorithm, although I might agree it is “breathtaking”. But maybe we shouldn’t be focused on the blame game, but rather on how to ensure no blameworthy behaviours emerge, or at least how to detect whether they could.
PS regarding the green screen:
I realize that in the good old days these might have had other uses, but with several colleagues from all over the world, we have come to the conclusion that their true value lies in hiding junk behind them :)
That’s the funniest thing I’ve heard all day!
Actually, noticing how taxing the filters are on the CPU and power consumption, I think my usage of it is making the world a bit greener in more than one way.
Indeed.
Dear Scott,
Thanks a lot for sending these points and thoughts; they provoked some interesting discussions in our group that you will certainly hear more about at some point.
I am the crazy old man shouting at the pigeons in the park.
I wanted to make a quick remark on the second topic, the humane programming languages (for engineers). I think Wolfram Mathematica / their Wolfram Language has come a long way towards this goal, and I think it already fulfills some of your wishes. For example, quantities with uncertainty are first-class citizens, so you can do things like this out of the box:
(I'm just playing around there to give you a flavor of what it can look like.)
I am aware of these Mathematica features, and I certainly think they’re great. Similar features exist in Matlab and a few other platforms and calculators. I wish they were less clumsy syntactically, and much more widely adopted and available. And I wish they were a lot cheaper, and accessible in real programming languages so they could become pervasive across and through the algorithmic ecosystem. There are also libraries for Python that (clumsily) bring most of the functionality we need to that language; even so, it’s surprising to me that no one is clamouring for computing languages to natively respect units and dimensions.
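To give a concrete flavour of what "native units" could look like in a general-purpose language, here is a toy sketch; the class and its behaviour are illustrative inventions, not any existing library's API:

```python
# Toy sketch of unit-carrying values: units travel with the number, and
# additions with mismatched units are rejected rather than silently done.

class Quantity:
    def __init__(self, value, units):
        self.value, self.units = value, dict(units)  # e.g. {"m": 1, "s": -1}
    def __add__(self, other):
        if self.units != other.units:
            raise TypeError("cannot add quantities with different units")
        return Quantity(self.value + other.value, self.units)
    def __mul__(self, other):
        # Multiply values; add the exponents of each unit symbol.
        units = dict(self.units)
        for u, p in other.units.items():
            units[u] = units.get(u, 0) + p
            if units[u] == 0:
                del units[u]
        return Quantity(self.value * other.value, units)

speed = Quantity(3.0, {"m": 1, "s": -1})     # 3 m/s
time = Quantity(2.0, {"s": 1})               # 2 s
distance = speed * time                      # 6.0 with units {"m": 1}
length = Quantity(4.0, {"m": 1})
total = distance + length                    # fine: both are metres
# speed + time would raise TypeError: different units
```

The point of the sketch is that dimension checking happens in the arithmetic itself, so a units mistake fails loudly at the offending operation rather than propagating silently.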
(It can do a bit more than simple interval arithmetic: https://reference.wolfram.com/language/ref/Around.html)
Yeah, it is what we call a centred-form interval. Great. These are a step toward that goal where “distributions and intervals are the numbers of the future”.
Of course, it has its own limitations; apart from being an expensive commercial product, Mathematica has always followed a rather inhumane “garbage-in, garbage-out” philosophy that at times greatly hampers the traceability of results (speaking from sour experience). Computations are also an order of magnitude slower than in, say, Python. Nevertheless, I think many of the algorithmic challenges arising from this topic may have been at least partially addressed, to the point that they find commercial application.
Absolutely. I was thinking the price (inaccessibility) was the big thing, but the gigo philosophy is perhaps just as serious. I’d actually be interested in your “sour” experience. My old employer accumulated a lot of such experiences with spreadsheets (see Panko’s work on their ergonomics). When we were developing the units software back in the 1990s, I could not find any studies that argued that overall error rates were actually lower with electronic computers than they were with human computers. Everybody assumes they are, but where is the empirical evidence and what are the details? Do you know of any? And on what scales should we even be comparing? How should you be comparing a partially automated computation process with a fully human one run by Octavia Spencer (i.e., folks like Johnson, Jackson and Vaughan at NASA Langley celebrated in the film Hidden Figures).
Nick's Overleaf is not accessible to outsiders, unfortunately.
Ah, sorry. Of course. Nick tells me the paper is still in rough draft format, and he may be a bit chagrined about it. Maybe I can send some slides instead. See the attached. These have grown (metastasized?) from the first presentation about the topic a couple of years ago. Several are animated. There are lots of hidden slides, and there probably should be more of them hidden. Of course we would love to talk to you about any of this any time.
Best,
Sebastian
Sorry again for the tardiness of my replies. And their loquaciousness.
It is my practice to record exchanges like this on the public website. Is that okay with you? Of course I omit your email address, and I can omit identifying personal information and any text you assert copyright to, etc.
Best regards,
Scott
Algorithm speed is not the only algorithm virtue. It’s always been fast to get wrong answers. Algorithms also need to be fair and just, appropriately transparent and discoverable, and they have to be used correctly and with regard for the individual details of humans they affect. If an algorithm is not these things, the answer it gives is unusable and thus wrong, irrespective of other virtues it might have. It seems to me it’s preferable to insist on good answers, even if it means we have to wait a few more milliseconds or longer.
From: Ferson, Scott
Sent: 16 May 2021 2:27 PM
To: Gasieniec, Leszek <lechu@liverpool.ac.uk>; Wild, Sebastian <Sebastian.Wild@liverpool.ac.uk>; Potapov, Igor <potapov@liverpool.ac.uk>; Wong, Prudence <pwong@liverpool.ac.uk>; Spirakis, Paul <spirakis@liverpool.ac.uk>; Zamaraev, Viktor <Viktor.Zamaraev@liverpool.ac.uk>; Krysta, Piotr <pkrysta@liverpool.ac.uk>
Cc: Nick Gray <nick.g.gray@gmail.com>; Silva, Pedro <Pedro.Silva@liverpool.ac.uk>; Alex <alexanderpwimbush@gmail.com>
Subject: projects in humane algorithms
Dear Leszek,
Following the humane algorithms discussion on Friday before last, there are two broad project ideas that might pique your interests.
The first is what we call the smarter mobility project, which envisions a distributed computational solution to the “elevator-stairs” problem for individual travellers. It relies on merging psychometric research in risk communication and risk-appetite psychology with a stochastic, risk-aware version of a generalised Dijkstra algorithm. The idea is to fix the troublesome Google Maps, wrest it from commercial interests, and interactively customise it for an individual’s needs and dispositions. It features local stochastic optimisation and a feedback loop to generate more data. The public-facing website at https://sites.google.com/site/smartermobilitynetwork/workshops is perhaps the best introduction to the project, which at this early stage is essentially vapourware. Attached are a “pitch” to would-be graduate students with some rather inartful narration, and a longer introduction I used for a talk. Neither is rich in detail about the technical challenges the project expects to encounter. The paper “Optimising cargo loading and ship scheduling in tidal areas” (Le Carrer et al. 2020. European Journal of Operational Research 280: 1082-1094) peeks at the maths issues.
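To give a rough flavour of the kind of generalisation involved (this is only a toy sketch with hypothetical nodes and weights, not the project's actual algorithm), a Dijkstra variant might let a traveller-specific parameter trade off travel time against physical effort:

```python
import heapq

# Toy sketch: Dijkstra where each edge carries (time, effort) and a
# traveller-specific weight alpha trades them off, so the "shortest"
# path depends on the individual's disposition (stairs vs lift).

def best_route(graph, start, goal, alpha):
    """graph: node -> list of (neighbour, time, effort); 0 <= alpha <= 1."""
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue
        for nbr, time, effort in graph[node]:
            cost = d + (1 - alpha) * time + alpha * effort
            if cost < dist.get(nbr, float("inf")):
                dist[nbr] = cost
                prev[nbr] = node
                heapq.heappush(queue, (cost, nbr))
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# Lobby -> Office via stairs (fast, high effort) or lift (slow, low effort)
g = {"lobby": [("stairs", 1, 5), ("lift", 3, 1)],
     "stairs": [("office", 1, 5)],
     "lift": [("office", 3, 1)]}
print(best_route(g, "lobby", "office", alpha=0.0))  # time-minimiser takes the stairs
print(best_route(g, "lobby", "office", alpha=1.0))  # effort-minimiser takes the lift
```

The project's stochastic, risk-aware version would replace the scalar edge costs with uncertain quantities and the fixed alpha with an elicited risk-appetite profile, which is where the real technical challenges begin.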
The second grand idea is even more pretentious. For scientists and engineers, computing is a lot different than managing data systems or websites. I teach both Python and Matlab to undergraduate engineers, and they mostly hate it. Although engineers are problem solvers, their skills, expectations and needs are very different from those of traditional programmers and certainly different from those of computer scientists, who are essentially mathematicians from an engineer’s point of view. The tools and provisions in, well, all the programming languages I know of are really woefully inadequate for our purposes. There are three features we think useful languages should have. We want to define a new high-level computer programming language that considers units like “meters” or “newton meters per second” as integral and native parts of numerical values, automatically handles uncertainty propagation and sensitivity analysis for all calculations, and has extensive built-in facilities for tracking and linking code elements and structure to justifications, data provenance records, assumptions, prior and subsequent analyses. The ideas are explained in https://sites.google.com/site/davmarkup/computer-language and https://sites.google.com/site/humanealgorithms/say/otherhumans. Nick Gray has been working on automating uncertainty analysis (https://www.overleaf.com/project/6050778244f8a80cc16ef909), which is the intellectually difficult part of this project.
Last Friday we also talked about responding to a recent ESRC request for proposals on “inclusive ageing”. It seems the deadline for a proposal is too soon, so a submission is unlikely. See the notes at https://sites.google.com/site/humanealgorithms/say/ageinginclusiveageing.
We’d be happy to talk to you about these or other possible projects any time.
Best regards,
Scott
Dear Scott,
I am sorry for the late response. The last month or so overwhelmed me, mostly due to the end of the academic year and my heavy involvement in coordinating MSc projects (we have 250+ of them this year, a jump of 500% compared to two years ago, plus the pandemic factor). This is on top of other (including eternal) duties which coincided at the same time (perhaps bad luck, poor planning, or a bit of both).
I understand that you received Sebastian's response in relation to the Wolfram Mathematica language. Sadly we are not research-active in the area of programming languages, but we really enjoyed reading and discussing your paper on ships and tides. Optimisation methods are where we feel much stronger. We had a couple of meetings in our Networks group in the meantime to discuss our understanding of the challenges in the humane algorithms area, and in particular where the potential overlaps between our groups lie. I will be back to you with more concrete suggestions very soon.
Have a great weekend, everyone,
Best, Leszek
From: Ferson, Scott
Sent: 25 June 2021 7:37 PM
To: Gasieniec, Leszek <lechu@liverpool.ac.uk>
Subject: FW: projects in humane algorithms
Dear Leszek,
Of course we appreciate that your specialisations are focussed. We’ll be interested in your suggestions.
Unfortunately, I know what you mean about the end of the academic year. Our responsibilities are nothing compared to your MSc program, but they do seem to fill every waking minute nonetheless.
I presume that you meant to type “external” rather than “eternal”, but eternal made me laugh out loud, because these assessment duties do seem to be eternal, and possibly infinite in other dimensions besides time.
Sorry that I neglected to copy you on my response to Sebastian’s thoughtful email. It is below, with responses in blue.
Bon weekend.
Cheers,
Scott
From: Ferson, Scott
Sent: 16 July 2021 19:15:35
To: Gasieniec, Leszek
Cc: phristov_contact; Gray, Nicholas [nickgray]
Subject: RE: projects in humane algorithms
Dear Leszek,
We're proceeding with our work on what we call an "uncertainty language" which we intend to be a syntax for specifying and computing with general uncertainty structures that is epistemologically sophisticated yet shallow and simple enough for use by people with little quantitative training. It's part of a three-year project with Airbus funded by the UK government.
We have mathematicians, statisticians and programmers, but we're looking for some computer scientists to help address the big picture. We don't think the issues are terribly profound, but they are interesting and fashionable, and possibly important if they enable practicing engineers to up their game in uncertainty quantification.
Although you and your colleagues are not research active in programming languages, perhaps some of your 250 masters students would be interested in this area. We are used to working with multidisciplinary teams, and I think I could promise a good environment with collegial interactions for such students. We can make a list of possible project topics if this might be of use or interest to you.
Just a thought.
Good weekend.
Cheers,
Scott
From: Gasieniec, Leszek
Sent: 17 July 2021 8:16 AM
To: Ferson, Scott <ferson@liverpool.ac.uk>
Cc: phristov_contact <peter.hristov24@gmail.com>; Gray, Nicholas [nickgray] <Nicholas.Gray@liverpool.ac.uk>
Subject: Re: projects in humane algorithms
Dear Scott,
Many thanks for reaching out. I am sure I will be able to identify a couple of colleagues whose expertise is suitable for such projects. I will be back to you with some suggestions shortly.
Our current MSc cohort is very mixed, in both the programmes and the quality of the students. The latter poses the greatest difficulty for the coordinator (myself) and the supervisors, especially amid the covid-19 pandemic. Having said this, there are always some students who tick the relevant boxes; we just need to select them with care.
The projects for the main cohort already started in June, but I believe we have about 50 extra students who started in January, and their projects kick off towards the end of August. I am not directly responsible for this group but can help to make the relevant arrangements if needed. Please send me the ideas for such projects asap.
This is also to let you know that we did not give up work on humane algorithms. Our discussions and ideas hover around the concept of human-algorithm interaction, as opposed to the more classical human-computer interaction (HCI), where the notion of an algorithm is more elusive than that of a computer. As you may guess, our main interest is still in algorithmic and optimisation problems, with the focus on the interplay between
- the classical (provable, including parallel and distributed) algorithms,
- AI/machine learning methods, and
- a human / population.
The progress in our efforts is stalled to some extent by the summer period, but full speed should be restored towards the end of the summer.
Best wishes, Leszek
Leszek:
It took a few days to cobble some bits together, but here are 8 possible projects for the master’s students. Obviously, they can be edited or chopped up as needed.
Best regards,
Scott
Machine learning and statistics with interval uncertainties
Develop feasible strategies to compute optimal bounds for selected practical problems in imprecise statistics and machine learning, such as linear or logistic regression, t-tests, ANOVAs, classification, and clustering, when input data are interval ranges rather than precise point values. Basic descriptive statistics for such data sets, subsuming and extending well-known strategies developed for censored and missing data, have been developed in recent years. These problems are typically NP-hard in general, but feasible solutions are often possible in a wide range of important special cases that depend on the patterns of overlap among the intervals.
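As a concrete example of the easy end of this spectrum (a toy sketch, not project code), the bounds on the sample mean of interval data follow directly from the endpoints; it is statistics such as the variance whose optimal bounds are NP-hard in general, and where the algorithmic interest lies:

```python
# Sketch of a descriptive statistic for interval data: the sample mean of
# intervals [lo_i, hi_i] is itself an interval, bounded by the mean of the
# lower endpoints and the mean of the upper endpoints.

def interval_mean(data):
    """data: list of (lo, hi) pairs; returns (mean_lo, mean_hi)."""
    n = len(data)
    return (sum(lo for lo, _ in data) / n,
            sum(hi for _, hi in data) / n)

measurements = [(1.0, 2.0), (2.0, 4.0), (3.0, 3.0)]
print(interval_mean(measurements))  # (2.0, 3.0)
```

The mean is easy because it is monotone in every input; non-monotone statistics require searching over endpoint configurations, which is where the overlap patterns among the intervals determine tractability.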
Conditional control under uncertainties
Define the generalisation of an IF conditional control structure for contexts in which the condition to be tested is uncertain or probabilistic. The solution may require multiple modalities as the interpretation of conditionals under uncertainty can vary widely. For instance, if the variable A is an interval of possible values in the range [0,5], then the assignments (left) could result in various values for the output variable B (right):
if A >= 3 then B = A + 2 else B = A - 2    [ 2, 7]             (then-branch applied to all of A)
if A >= 3 then B = A + 2 else B = A - 2    [-2, 3]             (else-branch applied to all of A)
B = ifelse(A, 3, A+2, A-2)                 [-2, 7]             (envelope of both branches)
B = GE(A, 3, A+2, A-2)                     [-2, 1] U [5, 7]    (A split at the threshold 3)
each of which has useful applications. Tracking different branches from uncertain IF statements in interval and probabilistic programming can involve partial or simultaneous executions. Still more subtle interactions can occur when extending this generalisation to related control structures (if-else, if-elif-else, nested if, switch/case).
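The four modalities above can be sketched with a toy interval type. All names here are illustrative; a real implementation would build on a proper interval arithmetic library:

```python
class Interval:
    """Minimal closed interval supporting shift by a scalar."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, k):
        return Interval(self.lo + k, self.hi + k)
    def __sub__(self, k):
        return Interval(self.lo - k, self.hi - k)
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

A = Interval(0, 5)

# Modality 1: the condition *may* hold, so apply the then-branch to all of A
b1 = A + 2                                             # [2, 7]
# Modality 2: the condition may *fail*, so apply the else-branch to all of A
b2 = A - 2                                             # [-2, 3]
# Modality 3 (ifelse): envelope (hull) of both branch results
b3 = Interval(min(b1.lo, b2.lo), max(b1.hi, b2.hi))    # [-2, 7]
# Modality 4 (GE): split A at the threshold, keep the branch images separately
below, above = Interval(A.lo, 3), Interval(3, A.hi)
b4 = [below - 2, above + 2]                            # [-2, 1] U [5, 7]
```

Only the fourth modality keeps the disconnected structure of the result; the others trade precision for a single-interval answer.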
Markup language
Design a shallow markup language by which authors can annotate their natural language documents and texts to assist natural language processing in identifying textual semantics and argumentation structure. A text with such markup can be read by a computer to identify its main point, ancillary points, subplots, arguments, lines of evidence, the evidence itself, its sources and provenance, definitions, examples and counterexamples, and other germane aspects that allow the computer to construct answers to questions about the logical and argumentative structure that are richer and more relevant than currently achievable using undirected schemes based on text and document analysis alone. The new markup language would enable the encoding of justification links into text to facilitate automatically answering questions like "why?", "how do you know?", "where's the evidence?", "so what?". By including markups that characterise the relation of claims based on inferences from quantitative analyses, the scheme could facilitate automatic uncertainty analysis to answer the question "are you sure?", or even re-running simulations or re-analysing data with a reader's new assumptions and null hypotheses. Arguments could be tracked in both forward and backward modes, i.e., making a (forward) exposition by going from observations and assumptions to conclusions, versus (backward) looking up evidence and justifications.
The idea builds on available tools such as Jupyter and Knitr, which integrate calculations into documents as literate programs sensu Knuth; hypertext and uniform resource locators, which support linking, cross-referencing, and documents with nonlinear structures (networks, trees, layers), and dynamically generate documents by reference and transclusion; HTML, CSS and TeX, which make documents accessible on different platforms in modes suitable for readers' needs and preferences; and the Semantic Web, Wiktionary, ORCID's e-science, the Resource Description Framework, and the Web Ontology Language, which make Internet data machine-readable via author marking of documents to encode semantics along with the data, with part-whole connections to support reasoning about people, animals, places, ideas, and anything that can be identified with a uniform resource identifier.
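As a purely hypothetical illustration of what such annotation might look like (the tag and attribute names below are invented for this sketch, not a proposed standard):

```xml
<!-- Invented syntax: a claim linked to its justification so a reader
     (or a machine) can ask "why?" and "where's the evidence?" -->
<claim id="c1" evidence="e1" answers="why; how-do-you-know">
  Cycling to work lowers a commuter's transport emissions.
</claim>
<evidence id="e1" provenance="author's survey data" rerun="analysis-notebook">
  Survey respondents who cycled reported lower transport emissions.
</evidence>
```

The `rerun` attribute gestures at the "are you sure?" use case: a pointer to the analysis that produced the claim, so it could be re-executed under a reader's own assumptions.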
Uncertainty quantification interpreter
Develop an interpreter in Python for a language for uncertainty quantification (UQ). The interpreter will be able to parse expressions like "normal([2,3],1)", "about 5", "[3,4]+uniform(0,1)", and other types of convolutions, and express assumptions about intervariable dependencies. It will turn the expressions into unequivocal UQ code ready to be crunched under the hood by existing and in-development UQ libraries. Ideally this could be linked to the markup language project mentioned above and a domain-specific syntax project mentioned below.
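A crude sketch of the interpreter idea, representing each uncertain quantity by Monte Carlo samples for simplicity. A real interpreter would parse expressions properly and dispatch to interval and p-box arithmetic libraries rather than relying on `eval` and sampling; the names `Dist` and `interpret` are illustrative only:

```python
import random

N = 5000  # Monte Carlo sample size for this toy sketch

class Dist:
    """A distribution represented by samples; '+' convolves assuming independence."""
    def __init__(self, samples):
        self.samples = samples
    def __add__(self, other):
        if not isinstance(other, Dist):          # allow scalars on either side
            other = Dist([other] * N)
        return Dist([a + b for a, b in zip(self.samples, other.samples)])
    __radd__ = __add__

# Restricted namespace the expression strings are evaluated in
ENV = {
    "normal":  lambda mu, sd: Dist([random.gauss(mu, sd) for _ in range(N)]),
    "uniform": lambda a, b:   Dist([random.uniform(a, b) for _ in range(N)]),
}

def interpret(expr):
    """Evaluate a UQ expression string with no access to builtins."""
    return eval(expr, {"__builtins__": {}}, ENV)

d = interpret("normal(2, 1) + uniform(0, 1)")    # mean should land near 2.5
```

Handling interval-valued parameters like "normal([2,3],1)", dependency assumptions, and phrases like "about 5" is exactly where the project's real design work lies; this sketch only shows the parse-then-dispatch shape.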
Domain-specific syntax
Develop a domain-specific syntax for an uncertainty quantification (UQ) language for a calculus expressed in Julia. The end goal would be to enable users to write or otherwise express problems in this high-level UQ language and compile the code into standard Julia. With Julia's metaprogramming features it is possible to do this entirely within Julia and to implement a REPL mode with https://github.com/MasonProtter/ReplMaker.jl.
Programming language for next-gen science
Design a programming language for next-generation scientific computing that includes (i) native support for specifying, projecting and checking measurement units in calculations, (ii) basic support for computing with uncertainty in the form of intervals, probability distributions, and p-boxes as first-class objects, and (iii) language structures and facilities that ensure meaningful code comments and provenance traceback consistent with standard or express auditing requirements.
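Feature (i) can be sketched as a quantity type whose arithmetic refuses to mix incompatible units. The design and names below are illustrative only, not a proposal for the language itself:

```python
class Quantity:
    """A value tagged with a unit; addition requires matching units."""
    def __init__(self, value, unit):
        self.value, self.unit = value, unit
    def __add__(self, other):
        if self.unit != other.unit:
            raise TypeError(f"cannot add {self.unit} to {other.unit}")
        return Quantity(self.value + other.value, self.unit)
    def __mul__(self, other):
        # Units compose under multiplication; track the product symbolically
        return Quantity(self.value * other.value, f"{self.unit}*{other.unit}")
    def __repr__(self):
        return f"{self.value} {self.unit}"

d = Quantity(3.0, "m") + Quantity(2.0, "m")    # fine: 5.0 m
try:
    Quantity(3.0, "m") + Quantity(2.0, "s")    # unit mismatch is caught
except TypeError as e:
    print(e)
```

In the envisioned language this checking would be native and static rather than a runtime bolt-on, and the same first-class treatment would extend to intervals, distributions, and p-boxes (feature ii).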
Distributed decision making
Derive an optimisation scheme for distributed decision making, employing stochastic optimisation techniques to identify personally optimal modes, schedules and routes for travel from pre-computed risk maps. Personally optimal decisions are identified by applying individualised decision rules that reflect the psychometric attitudes and past choices of a traveller. Different people have different requirements and different values; they have different "risk appetites" that depend on their personalities and goals. In general, the tolerable risks and costs are different for commuters, shoppers, students, truckers, delivery persons, and leisure travellers, and a single person has different requirements and values over time depending on the purpose of a trip. Given these constraints and preferences, an optimal distributed decision for this multicriteria, multidecision problem accounts for the various costs of travel, including (1) risk of death and injury for the traveller, passengers, pedestrians, and other travellers, (2) environmental costs in terms of likely emissions of vehicle exhaust, NOx, hydrocarbons, particulate matter, and greenhouse gases during the trip, together with the attributable ecological impacts associated with habitat destruction and dissection from infrastructure construction and maintenance, and (3) economic costs of the trip given the route, schedule, mode, and vehicle, but also the indirect economic costs associated with traffic congestion delays, health impacts from injuries and pollution, environmental degradation, and infrastructural investment and maintenance.
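The individualised decision rule can be sketched as a weighted multi-criteria minimisation. All routes, cost figures, and profile weights below are invented numbers purely for illustration:

```python
# Per-route normalised costs for three hypothetical options
routes = {
    "motorway":   {"risk": 0.8, "emissions": 0.9, "cost": 0.3},
    "back_roads": {"risk": 0.4, "emissions": 0.5, "cost": 0.6},
    "cycle_path": {"risk": 0.3, "emissions": 0.1, "cost": 0.9},
}

# "Risk appetite" encoded as criterion weights per traveller profile
profiles = {
    "commuter": {"risk": 0.5, "emissions": 0.2, "cost": 0.3},
    "trucker":  {"risk": 0.2, "emissions": 0.1, "cost": 0.7},
}

def best_route(weights):
    """Pick the route minimising the weighted sum of criterion costs."""
    return min(routes, key=lambda r: sum(weights[k] * routes[r][k]
                                         for k in weights))

for who, w in profiles.items():
    print(who, "->", best_route(w))
```

Even this toy shows the key behaviour: different weight profiles select different routes from the same risk map, which is what makes the distributed problem genuinely multi-decision rather than one shared optimum.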
Privacy via blockchain
Design a scheme using blockchain accounting with strong encryption to interact with would-be travellers, eliciting their psychometric preferences (embodied in stated preferences and previous choices about trips) in a way that ensures privacy protections and precludes archiving personal information about the trips or preferences of individual travellers.
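The ledger part of the scheme can be sketched as a hash chain, in which each record stores the hash of its predecessor so that tampering is detectable. This toy omits the strong encryption and key management a real design would need; the payload strings are placeholders:

```python
import hashlib
import json

def add_record(chain, payload):
    """Append a record whose hash covers the payload and the previous hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    digest = hashlib.sha256(
        json.dumps({"prev": prev, "payload": payload},
                   sort_keys=True).encode()).hexdigest()
    chain.append({"prev": prev, "payload": payload, "hash": digest})

def verify(chain):
    """Recompute every hash; any edit to an earlier record breaks the chain."""
    prev = "0" * 64
    for block in chain:
        expect = hashlib.sha256(
            json.dumps({"prev": prev, "payload": block["payload"]},
                       sort_keys=True).encode()).hexdigest()
        if block["prev"] != prev or block["hash"] != expect:
            return False
        prev = block["hash"]
    return True

chain = []
add_record(chain, "encrypted preference blob 1")   # placeholder payloads
add_record(chain, "encrypted preference blob 2")
print(verify(chain))                               # True
chain[0]["payload"] = "tampered"
print(verify(chain))                               # False
```

Note that integrity and privacy are separate properties: the hash chain gives the former, while the latter requires that payloads be encrypted (and preferably aggregated) before they ever reach the ledger.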
_______________________________________
From: Gray, Nicholas [nickgray]
Sent: 18 July 2021 18:59:40
To: Gasieniec, Leszek
Cc: Ferson, Scott; phristov_contact
Subject: Re: projects in humane algorithms
Leszek,
Hope you don’t mind me jumping in with a question.
I wondered whether you could give me an example of the distinction that you draw between Human-Computer interaction and Human-Algorithm interaction?
Thanks
Nick
Hi Nick,
this is a very good question indeed.
One can see HCI as a very large/wide research field which also includes
what we would like to call human-algorithm interaction.
Having said this, I have a strong impression that HCI mainly focuses on interfaces
and the input vs output relationship, without digging much into what happens in between
(in a way it is more about syntax, hardware, and to some extent semantics,
but less about correctness, efficiency, and improvement/optimisation).
And this is where we see room for human-algorithm interaction, which
goes beyond predefined computer vs end-user interaction.
One can imagine that there are various levels of human contribution,
ranging from an uninformed end user through to an expert designer/creator.
We would like to see a user/advisor relationship between the three components
that I mentioned in the last message, i.e.,
- classical algorithms
- AI/ML methods
- human (with different levels of expertise)
In this relationship all three components are partners,
all three can act as the user or as the advisor depending on the current need,
all components can learn (and in turn improve) through interaction with one another.
Best wishes, Leszek