2024-01-03 JAN
News from NeurIPS 2023, delivering AI Assistants
Journal Club
NeurIPS | 2023 [Jeya]
Thirty-seventh Conference on Neural Information Processing Systems
News / Highlights / paper awards
Ideas relevant to epiVerse
Foundation models are creating a paradigm shift in AI: https://arxiv.org/abs/2108.07258. Recipe by Andrew Ng:
Build prototype using LLM APIs.
If safe, deploy immediately (no testing).
Monitor performance. If you spot a tricky example, add it to your hand-crafted eval dataset. When tuning (including prompt engineering), examine results on the eval set; the eval set can be ~10 examples.
Optional: develop systematic error metrics that are more relevant to your KPI.
Optional: invest in building a large eval set.
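The recipe above can be sketched as a small loop. This is a sketch only: `call_llm` is a hypothetical stand-in for a real LLM API call, and the eval-set shape is invented for illustration.

```python
# Minimal sketch of the "deploy fast, grow the eval set" recipe.
# call_llm is a hypothetical stub; a real system would call an LLM API here.

def call_llm(prompt: str) -> str:
    # Placeholder behaviour so the sketch runs end to end.
    return prompt.split()[-1].upper()

# Hand-crafted eval set: start tiny (~10 examples), grow over time.
eval_set = [
    {"prompt": "echo the last word: hello", "expected": "HELLO"},
]

def add_tricky_example(prompt: str, expected: str) -> None:
    """When monitoring surfaces a hard case, fold it into the eval set."""
    eval_set.append({"prompt": prompt, "expected": expected})

def run_eval() -> float:
    """Re-run after every prompt/model tweak; returns accuracy on the set."""
    hits = sum(call_llm(ex["prompt"]) == ex["expected"] for ex in eval_set)
    return hits / len(eval_set)

add_tricky_example("echo the last word: world", "WORLD")
```

The point is the workflow, not the stub: every tuning step is checked against the growing eval set before redeploying.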
Evaluation packages: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
Design pattern 1: for applications with a clear right answer, measure accuracy.
Design pattern 2: for applications with several good answers or approximate answers, develop an LLM agent to evaluate the original LLM's output. If an approximate gold-standard answer (or reference sources for one) is available, include it in the agent's knowledge base.
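A minimal sketch of design pattern 2. The `judge_llm` stub is hypothetical: a real judge would itself be an LLM call whose prompt contains the question, the candidate answer, and any approximate gold-standard reference.

```python
# Sketch of LLM-as-judge evaluation (design pattern 2).
# judge_llm is a hypothetical stand-in for a second LLM acting as grader.

def judge_llm(question, candidate, reference=None):
    # Stub logic: accept the candidate if it shares a token with the
    # reference answer. A real judge would reason over the texts instead.
    if reference is None:
        return bool(candidate.strip())
    return bool(set(candidate.lower().split()) & set(reference.lower().split()))

def evaluate(question, candidate, reference=None):
    """Grade one output; aggregate over the eval set for a score."""
    return judge_llm(question, candidate, reference)
```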
Exploring open "source" models: versioning, privacy, product vs. model concerns
Mistral: https://mistral.ai/
DeepInfra: https://deepinfra.com/pricing
AnyScale: https://www.anyscale.com/endpoints (mentioned by Yann LeCun)
Agents
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings (https://arxiv.org/abs/2305.11554)
Ideas relevant to Federated Learning
EPFL (https://www.epfl.ch/labs/mlo/) is deeply involved in P2P federated learning. Potential ally in the GDPR zone?
EPFL's efforts on P2P with WebRTC: https://github.com/epfml/disco
EPFL's efforts on efficient P2P learning through "Epidemic" learning: Boosting Decentralized Learning with Randomized Communication https://arxiv.org/abs/2310.01972
Test of time award winner - lessons learned from word2vec
Paper: Distributed Representations of Words and Phrases and their Compositionality (https://arxiv.org/abs/1310.4546)
Semi-supervised objectives + large corpora = key to NLU
Skip gram
CBOW
Next word prediction
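For intuition, here is how skip-gram and CBOW slice the same sentence into training examples. A sketch only: real word2vec adds larger windows, subsampling, and negative sampling on top of this.

```python
# Skip-gram vs. CBOW training examples from one sentence (window = 1).

def skipgram_pairs(tokens, window=1):
    # Skip-gram: predict each context word from the center word.
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window=1):
    # CBOW: predict the center word from the bag of surrounding words.
    examples = []
    for i, center in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window),
                                  min(len(tokens), i + window + 1))
                   if j != i]
        if context:
            examples.append((context, center))
    return examples
```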
Fast, parallel, weakly-synchronized computation dominates ML
Allows scaling, which gives better results.
A single parameter server distributes model parameters across multiple machines and orchestrates their learning. This led to their next big paper and was motivated by negative sampling.
An aversion to locking and synchronization was the biggest enabler of works like these.
Focus your compute where it really helps improve your learning
Common tokens are easier to learn and less informative, so negative-sample the tokens that are frequent in both inputs and targets.
Make models simpler and faster (parallel) by focussing on the important problems.
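One concrete instance of "focus your compute": word2vec draws negative samples from the unigram distribution raised to the 3/4 power, which damps very frequent tokens relative to plain frequency sampling. A sketch under that assumption; the sampler interface itself is invented for illustration.

```python
import random
from collections import Counter

# Sketch of word2vec-style negative sampling: negatives are drawn in
# proportion to count(w) ** 0.75, so frequent tokens are down-weighted
# relative to their raw frequency.

def negative_sampler(tokens, power=0.75, seed=0):
    counts = Counter(tokens)
    words = list(counts)
    weights = [counts[w] ** power for w in words]
    rng = random.Random(seed)

    def sample(k=5, exclude=()):
        # Draw k negatives, skipping the positive words in `exclude`.
        out = []
        while len(out) < k:
            w = rng.choices(words, weights=weights)[0]
            if w not in exclude:
                out.append(w)
        return out

    return sample
```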
Word2vec > RNNs. Transformers > LSTMs.
Notes from Jeya's presentation:
https://jalammar.github.io/illustrated-word2vec , https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf
Tokenization helps solve nuanced problems
Tokenization strategy: which bits of text get a vector. Which bit to focus on?
It can be used for phrase representations. Compound concepts/nouns are represented by multiple words. Bigrams made up of two infrequent words are focussed on more.
Have a flexible strategy on representing words.
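The paper's phrase detection can be sketched as a bigram score, (count(ab) - delta) / (count(a) * count(b)): it is high for bigrams of two individually infrequent words, which then get merged into a single token.

```python
from collections import Counter

# Sketch of word2vec's phrase (bigram) scoring. High-scoring bigrams like
# "new york" are promoted to single tokens; frequent-word pairs are not.

def bigram_scores(tokens, delta=1.0):
    unigram = Counter(tokens)
    bigram = Counter(zip(tokens, tokens[1:]))
    return {pair: (c - delta) / (unigram[pair[0]] * unigram[pair[1]])
            for pair, c in bigram.items()}
```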
Sub-word tokenization is still used today in Transformers
Treating language as a sequence of dense vectors is powerful.
Representing concepts as dense vectors: operators in that space exploit geometrical relationships.
Rumelhart suggested this in 1985, and neuroscientists have debated it for decades, but these were just conjectures.
Word2vec: syntactic and semantic relationships were represented geometrically (PCA).
By simple addition/subtraction you can solve analogy problems.
Paris - France + Italy = Rome
Sushi - Japan + Germany = Bratwurst
Czech + currency = koruna
French + actress = Juliette Binoche, Vanessa Paradis, Charlotte Gainsbourg
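A toy illustration of solving such analogies by addition/subtraction and nearest-neighbour lookup. The 2-d vectors here are invented for the sketch; real word2vec embeddings are learned and hundreds of dimensions wide.

```python
import math

# Hand-made toy vectors; placed so that paris - france + italy lands on rome.
vecs = {
    "paris":  [2.0, 3.0],
    "france": [2.0, 1.0],
    "italy":  [5.0, 1.0],
    "rome":   [5.0, 3.0],
    "berlin": [8.0, 3.1],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def analogy(a, b, c):
    """Solve a - b + c, returning the nearest word (excluding a, b, c)."""
    q = [va - vb + vc for va, vb, vc in zip(vecs[a], vecs[b], vecs[c])]
    return max((w for w in vecs if w not in {a, b, c}),
               key=lambda w: cosine(q, vecs[w]))
```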
FDA's entry - synthetic data
Knowledge-based in silico models and dataset for the comparative evaluation of mammography AI for a range of breast characteristics, lesion conspicuities and doses (https://arxiv.org/abs/2310.18494).
Hackathon
Delivering AI Assistants
Recall last FAIR Friday and the use of https://platform.openai.com/playground; can/should we proxy governance?
Google Cloud
Preprint dynamics
for example https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10593073
DeArraying update
Creating standards, acting on them remotely [Aaron]
Strategic planning
<keep thinking and voicing your opinions; the intersection of federated learning and FAIR has emerged>