source: https://www.statnews.com
https://x.com/jermdemo/status/1816135124326711388
https://x.com/OmicsOmicsBlog/status/1815755567626850557
We've tried/proposed:
- handholding
- 20% budget
epiTerra Extension: https://chromewebstore.google.com/search/epiTerra
chrome://extensions/?id=gfnefhciljgadngmaongkndnlhcaenpc
Compare with https://portal.isb-cgc.org ... governance sits at the API bearer-token level, plus well-supported BigQuery SDKs
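The governance comparison is concrete at the SDK level: ISB-CGC access rides on standard Google Cloud auth, so the bearer token comes from Application Default Credentials and queries go through the stock BigQuery client. A minimal sketch in Python; the table path and `your-gcp-project` are illustrative placeholders, not confirmed names:

```python
# Minimal sketch: querying an ISB-CGC public BigQuery table with the
# official google-cloud-bigquery SDK. Auth (the OAuth bearer token) comes
# from Application Default Credentials, e.g. after running
# `gcloud auth application-default login`.
from google.cloud import bigquery

# The billing project is yours; ISB-CGC tables live in their public project.
client = bigquery.Client(project="your-gcp-project")  # placeholder project ID

query = """
    SELECT project_short_name, COUNT(*) AS n_cases
    FROM `isb-cgc-bq.TCGA.clinical_gdc_current`  -- illustrative table path
    GROUP BY project_short_name
    ORDER BY n_cases DESC
    LIMIT 10
"""
for row in client.query(query).result():
    print(row.project_short_name, row.n_cases)
```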
API
Gemini Vertex AI: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#gemini-text-only-samples-drest
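For reference, the quickstart above boils down to a few lines with the Vertex AI Python SDK. A sketch, assuming a recent `google-cloud-aiplatform` release; the project ID, region, and model name are placeholders:

```python
# Minimal text-only Gemini call via the Vertex AI SDK, per the quickstart.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-gcp-project", location="us-central1")  # placeholders

model = GenerativeModel("gemini-1.5-flash")  # any Gemini model enabled for your project
response = model.generate_content("Explain WebGPU in two sentences.")
print(response.text)
```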
OpenAI: https://cloud4bio.github.io/
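And the equivalent call through the OpenAI API, for comparison. The model name is a placeholder; the key is read from the `OPENAI_API_KEY` environment variable:

```python
# Minimal OpenAI chat completion, mirroring the Vertex AI sketch above.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; swap in whichever model you're evaluating
    messages=[{"role": "user", "content": "Explain WebGPU in two sentences."}],
)
print(resp.choices[0].message.content)
```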
local Llama? - https://gemini.google.com/app/aed4b306c8ac30c9
Hosting the Llama model entirely in the browser is still an emerging area, but here are the current approaches and resources:
- Thiggle: This project (https://matt-rickard.com/llama-2-in-the-browser) uses WebGPU to run Llama models in the browser. However, it's limited to smaller models due to current hardware constraints.
- Quantization: You can quantize Llama models to shrink them significantly, which might make them loadable in the browser, though possibly at some cost in model quality.
- Ollama: This project hosts the model locally and lets you interact with it through a browser interface. It provides an OpenAI-compatible API, which makes integration straightforward (see the sketch after this list).
- Model API Gateway (Thiggle): This service lets you compare different models (including Llama 2) in the cloud. It isn't purely in-browser, but it's a convenient way to experiment without local setup.
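The Ollama item is the easiest of these to demo: because it serves an OpenAI-compatible endpoint (by default at http://localhost:11434/v1), the same OpenAI client shown earlier works against a local Llama model. A sketch, assuming you've already run `ollama pull llama3`:

```python
# Point the OpenAI client at a local Ollama server instead of api.openai.com.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",  # required by the client but ignored by Ollama
)
resp = client.chat.completions.create(
    model="llama3",  # placeholder; any model you've pulled locally
    messages=[{"role": "user", "content": "Can Llama run in the browser?"}],
)
print(resp.choices[0].message.content)
```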
Additional Considerations:
- Hardware requirements: Running LLMs in-browser is computationally intensive and requires powerful devices with good GPU support.
- Browser compatibility: WebGPU is still a developing standard, and support varies across browsers.
- Check the subreddit r/LocalLLaMA for discussions and updates on running LLMs locally.
- Look into the llama.cpp project, which might offer insights into browser-based deployment.
- Follow developments in WebGPU and browser technologies, as they could enable more efficient in-browser LLM hosting.
Model Garden: https://cloud.google.com/model-garden
[Lorena - LLM cost control] https://medium.com/google-cloud/how-to-reduce-your-ml-model-inference-costs-on-google-cloud-e3d5e043980f
Heartbeat API (for reference)