Comments / Criticism

It's been over a month since the last ML Arxiv post, and these may become less frequent or (for the better) include only papers where I have more substantial comments. I've also thought that, as /r/MachineLearning and ML Twitter are struggling, someone should have a script which watches new HuggingFace models, summarizes the Readmes to decide if they're interesting, and if possible matches each specialized model to an appropriate leaderboard.

Meta continues to withhold the 34B parameter version of LLaMa 2 - reportedly this has something to do with failed safety/alignment tests. Notably this model (like GPT-4Chan) does better on TruthfulQA. Some researchers have been investigating a harmful agreeableness or 'sycophancy' in human-aligned models, so I suppose the two are connected: a more agreeable model may be more likely to fall for the popular misconceptions covered by TruthfulQA.

This was an interesting post about the first database sharding paper - apparently this often-cited paper cannot be found - there are even multiple strains of the citation: https://shkspr.mobi/blog/2021/06/where-is-the-original-overview-of-shard-paper/

This reminds me of the BIG-Bench paper citations - as I've written before, there are abbreviated citations, ones with my name as Nick or Nicholas, and new authors added two months ago when BIG-Bench was accepted to a conference.

New section: Chat-O.M.G.

This person claims to be a public defender using ChatGPT to process files. They need help with prompting because ChatGPT won't engage with the graphic content. Replies are concerned, but also recommend going through the API: https://www.reddit.com/r/ChatGPTPro/comments/15s1cjt/attorney_use_question_prompt_for_use_with/

These threads give an insight into generated content oozing out into the web:

Recently I noticed that if I search for an upcoming book by title and author (Izgil, Waiting to Be Arrested at Night), Amazon shows several fake books claiming to be a "Summary and Analysis".

Papers

A Theory on Adam Instability in Large-Scale Machine Learning

We present a theory for the previously unexplained divergent behavior noticed in the training of large language models…

Meta finds some unusual loss spikes when training with the popular Adam optimizer, presumably while training LLaMa. They recommend mitigation methods from earlier papers rather than switching to other optimizers.

(Ab)using Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs

We demonstrate how images and sounds can be used for indirect prompt and instruction injection in multi-modal LLMs. An…

Adversarial noise has long been used to make ML models mis-label images. This takes it a step further and explores gradients of an open multimodal assistant model, such that the modified image determines the model's response (such as "Assistant: ok, I will talk like a pirate"). HOW??

I'd be curious if this is working better on smaller models, with instruction tuning (was there a "talk like a pirate" example in the instruction dataset?), and whether it responds to text instructions written in image form? Otherwise I am baffled how there's any foothold in the gradient space to start changing the image.
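On the "foothold" question, my best guess is that it comes down to the model being differentiable with respect to its pixel inputs, so you can follow the gradient of a chosen target output back to the image. Below is a minimal sketch of that idea on a toy linear classifier. It is not the paper's setup: the weights, the 4-pixel "image", the step size, and the perturbation budget are all made up for illustration.

```python
import math

# Hypothetical toy "model": a fixed linear layer from 4 pixels to 3 output classes.
WEIGHTS = [
    [0.9, -0.2, 0.1, 0.4],    # class 0
    [-0.3, 0.8, 0.5, -0.1],   # class 1
    [0.2, 0.1, -0.6, 0.7],    # class 2
]

def predict(image):
    """Softmax probabilities for each class."""
    logits = [sum(w * x for w, x in zip(row, image)) for row in WEIGHTS]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def input_gradient(image, target):
    """Gradient of the cross-entropy loss for `target` with respect to each pixel."""
    probs = predict(image)
    # d(loss)/d(logit_k) = p_k - 1[k == target], then chain rule through the linear layer.
    dlogits = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    return [sum(dlogits[k] * WEIGHTS[k][j] for k in range(len(WEIGHTS)))
            for j in range(len(image))]

def perturb(image, target, step=0.05, iterations=40, budget=0.5):
    """Signed-gradient steps toward `target`, clipped to a budget around the original."""
    adversarial = list(image)
    for _ in range(iterations):
        grad = input_gradient(adversarial, target)
        for j, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            low = max(0.0, image[j] - budget)
            high = min(1.0, image[j] + budget)
            adversarial[j] = min(high, max(low, adversarial[j] - step * sign))
    return adversarial

def top_class(image):
    probs = predict(image)
    return probs.index(max(probs))

image = [0.9, 0.1, 0.2, 0.3]
adversarial = perturb(image, target=1)
print(top_class(image), top_class(adversarial))
```

In a real multimodal model the "target class" would be a whole sequence of response tokens, and the loss would be summed over them, but the loop has the same shape.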

Bayesian Flow Networks

This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set…

ML Twitter is circulating this because the first author, Alex Graves, is a well-known scientist formerly at DeepMind (the institution here is NNAISENSE, a robotics startup which has been around for years). The pitch is that it's simpler to train than the noising and de-noising of a diffusion model, and the paper reports competitive log-likelihoods on image modeling benchmarks (binarized MNIST, CIFAR-10) as well as strong character-level language modeling results on text8.

This would be cool for the Bayesians / probabilistic AI squad who I saw in Finland.

Challenges and Applications of Large Language Models

Large Language Models (LLMs) went from non-existent to ubiquitous in the machine learning discourse within a few years…

This got mentioned in Hacker News comments for the blog post "Open challenges in LLM research" - both are neat and give onramps for someone who wants to start messing with LLMs.

Everyone here is giving credence to mechanistic interpretability research. I continue to be baffled by this because they are using toy problems. OpenAI already fast-forwarded to GPT-4 explaining every neuron in GPT-2, and you just get stuff like a "Canada neuron". Human brains have neurons for individual celebrities, but they're not going to be located in the same place between brains. And how close does that get us to understanding the inputs (seeing a face, hearing a voice), the outputs (that might be Tom Cruise?), and the associations (I saw Mission Impossible at my neighborhood movie theatre)?

Code Llama: Open Foundation Models for Code

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art…

This just came out, and looks promising (people had complained about using the original LLaMa v2 on code).

The context is extended to 100k tokens (!!) with a new technique on top of RoPE positional embeddings. This reminded me of SuperHOT and Meta's previous work, but looks to be something new. A footnote has a shout-out to a /r/LocalLLaMa thread where someone posted concurrent work - what a time to be alive.
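As far as I can tell from the paper, the change is a larger RoPE base period during long-context fine-tuning (10,000 raised to 1,000,000). Here is a rough sketch of what that knob does, in plain Python. The function names and the tiny 4-dimensional vectors are my own for illustration, and this is not Meta's code.

```python
import math

def rope_frequencies(head_dim, base):
    """Per-pair rotation frequencies: base ** (-2i / head_dim)."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

def apply_rope(vector, position, base=10_000.0):
    """Rotate consecutive (even, odd) pairs of `vector` by position * frequency."""
    rotated = []
    for i, freq in enumerate(rope_frequencies(len(vector), base)):
        angle = position * freq
        x, y = vector[2 * i], vector[2 * i + 1]
        rotated.append(x * math.cos(angle) - y * math.sin(angle))
        rotated.append(x * math.sin(angle) + y * math.cos(angle))
    return rotated

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

query = [0.5, -1.0, 0.25, 2.0]
key = [1.0, 0.5, -0.75, 0.1]

# The attention score depends only on the relative offset (2 in both cases),
# not on the absolute positions.
near = dot(apply_rope(query, 5), apply_rope(key, 3))
far = dot(apply_rope(query, 105), apply_rope(key, 103))

# A larger base makes the slowest-rotating pair rotate even more slowly.
slowest_default = rope_frequencies(128, 10_000.0)[-1]
slowest_long = rope_frequencies(128, 1_000_000.0)[-1]
print(near, far, slowest_default, slowest_long)
```

My reading is that slower rotations leave more room before angles wrap around at very long distances, which is presumably why the larger base helps with 100k-token contexts.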

The model was trained on a mix of code, natural language, and natural language discussing code topics. The safety section uses TruthfulQA and toxicity benchmarks for natural language; there's limited discussion of code vulnerabilities or breakout attempts or other exotic code model risks. They do mention the "kill a child process" issue with safety limits, but claim you can simply reword your queries.

CyberNative/CyberBase-13b

I don't know of any comparable model to this. The example shows it generating a plan for a pen test. When I looked up the source (CyberNative) it was a bit confusing, but maybe this will be developed more.

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

The misuse of large language models (LLMs) has garnered significant attention from the general public and LLM vendors…

Paper-fication of DAN and other prompts used to "jailbreak" ChatGPT.

Evaluating RAG pipelines with Ragas + LangSmith

Editor's Note: This post was written in collaboration with the Ragas team. One of the things we think and talk about a…

Interesting LangChain blog post trying to simplify the evaluation of retrieval pipelines. Code: https://github.com/explodinggradients/ragas

I've long wanted to explore RealtimeQA and similar ideas (TimeQA) where the model is not answering questions from its training data, but in fact retrieves new and surprising facts from new, post-training documents. Ideally the model would update its weights (ROME, though this method has its issues) or some knowledge base, or just have the new info in memory long enough to answer questions correctly.

Tangential: MEMIT is an alternative to ROME (updating facts of language models) which I hadn't heard about before.

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

The emergence of generative pre-trained models has facilitated the synthesis of high-quality text, but it has also…

Team extracts claims and sources from the output of a GPT model, and uses a Python interpreter, search engines, etc. to try to check them. The issue of GPT truthiness has reminded me of a fact-checkers talk at the New Yorker Festival (and this related article with Daniel Radcliffe) so it's good to see a system to break out individual claims.

FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization

The recent amalgamation of transformer and convolutional designs has led to steady improvements in accuracy and…

Rare Apple paper, image model, I have no idea what else they're talking about.

Getting from Generative AI to Trustworthy AI: What LLMs might learn from Cyc

Cyc is an old school AI database with millions of common sense statements and associations. I guess this is the closest I'll get to hearing what that school of thought thinks about LLMs. They pitch it as something which could make LLMs truthful or verifiable, but perhaps the most usable parts were the Cyc "proofs", which look like Chain of Thought.

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher

Safety lies at the core of the development of Large Language Models (LLMs). There is ample work on aligning LLMs with…

Authors use a basic Caesar cipher to get around GPT safety checks.
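For anyone who hasn't seen one since grade school, a Caesar cipher just shifts each letter a fixed number of places along the alphabet. A minimal sketch follows; the function name and the shift of 3 are my own choices, not taken from the paper.

```python
def caesar_shift(text, shift):
    """Shift ASCII letters by `shift` places, leaving everything else untouched."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            start = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - start + shift) % 26 + start))
        else:
            result.append(ch)
    return "".join(result)

encoded = caesar_shift("Hello, World!", 3)
decoded = caesar_shift(encoded, -3)
print(encoded)  # Khoor, Zruog!
print(decoded)  # Hello, World!
```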

How is ChatGPT's behavior changing over time?

GPT-3.5 and GPT-4 are the two most widely used large language model (LLM) services. However, when and how these models…

A puzzling trend in the ChatGPT world. This helped me to convince a student to put the dates when they accessed OpenAI's API into a paper. Although the social media conversation has usually been rumors about OpenAI "nerfing" or softening potentially risky outputs of ChatGPT, authors of this paper went more scientific. They show how the model appears to lose ability on a specific, objective task (identifying prime numbers).
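The nice thing about the prime number task is that scoring needs no human judgment. Here's a sketch of how such a check could be scored; the reply strings and the two "snapshots" are invented for illustration and are not results from the paper.

```python
def is_prime(n):
    """Trial division, fine for the small numbers used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True

def score_answers(answers):
    """answers maps a number to the model's reply; returns the fraction correct."""
    correct = 0
    for number, reply in answers.items():
        said_prime = reply.strip().lower().startswith("yes")
        correct += said_prime == is_prime(number)
    return correct / len(answers)

# Hypothetical replies from two model snapshots, keyed by the number asked about.
snapshot_a = {17077: "Yes, it is prime.", 91: "No.", 97: "Yes.", 221: "No, 221 = 13 x 17."}
snapshot_b = {17077: "No.", 91: "No.", 97: "No.", 221: "No."}
print(score_answers(snapshot_a), score_answers(snapshot_b))
```

Logging the access date next to each snapshot is what makes a comparison like this reproducible.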

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

Low-rank adaptations (LoRA) are often employed to fine-tune large language models (LLMs) for new tasks. This paper…

This is part of the slowly developing world of LoRA weights. Conceptually (but NOT technically/architecturally) you might compare it to AdapterHub layers. These make it easier to fine-tune large models and store the delta weights somewhere for reapplication. Talks from Microsoft hint that OpenAI uses this technique to switch between fine-tuned variants in the upcoming GPT-3.5 finetuning API.

The paper here talks about NOT the DevOps-y side of switching between LoRA weights, but creating a mixture-of-experts architecture for the right model to activate on a variety of tasks.
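As I understand it, a LoRA module is a pair of small matrices whose product is added to a frozen weight matrix, and LoraHub builds a new module as a weighted sum of existing ones. Below is a toy sketch with 2x2 weights and rank-1 modules. The matrices and the 0.5/0.5 weights are made up, and in the paper the weights are searched for on a few examples of the new task rather than set by hand.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def weighted_sum(matrices, weights):
    """Element-wise weighted sum of same-shaped matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(cols)]
            for i in range(rows)]

def compose(modules, weights):
    """modules is a list of (B, A) pairs; B is d x r and A is r x k."""
    merged_b = weighted_sum([b for b, _ in modules], weights)
    merged_a = weighted_sum([a for _, a in modules], weights)
    return merged_b, merged_a

def adapted_weight(base, module):
    """Frozen base weight plus the low-rank update B @ A."""
    b, a = module
    delta = matmul(b, a)
    return [[base[i][j] + delta[i][j] for j in range(len(base[0]))]
            for i in range(len(base))]

base = [[1.0, 0.0], [0.0, 1.0]]
task_one = ([[1.0], [0.0]], [[1.0, 0.0]])   # hypothetical LoRA for one task
task_two = ([[0.0], [1.0]], [[0.0, 1.0]])   # hypothetical LoRA for another task

merged = compose([task_one, task_two], [0.5, 0.5])
print(adapted_weight(base, merged))
```

Note that summing the A and B matrices separately is not the same as averaging the two full updates: the product of the sums also contains cross terms between the modules.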

Modeling Bitcoin Value with Vibes. This model has an R² of 0.97

OK I am not a smart stats person, but the Hacker News comments here discuss how it's deceptive to evaluate statistical models of prices like this.
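One version of the objection, as I understand it: price levels drift slowly, so almost any model that tracks yesterday's price gets a huge R² without predicting anything useful. The sketch below is my own illustration, not the linked model or anything from the HN thread. It uses a simulated random walk, and the "model" is just "tomorrow equals today".

```python
import random

def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean = sum(actual) / len(actual)
    residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    total = sum((a - mean) ** 2 for a in actual)
    return 1.0 - residual / total

# Simulated random-walk "price" series (assumption: unit-variance Gaussian steps).
random.seed(0)
prices = [100.0]
for _ in range(1000):
    prices.append(prices[-1] + random.gauss(0.0, 1.0))

# Scored on price levels, the no-skill model looks excellent.
level_r2 = r_squared(prices[1:], prices[:-1])

# Scored on daily changes, the same model (predicting zero change) has no skill.
changes = [later - earlier for earlier, later in zip(prices, prices[1:])]
change_r2 = r_squared(changes, [0.0] * len(changes))

print(level_r2, change_r2)
```

So a high R² on price levels says very little by itself; the more telling test is whether a model explains the changes.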

Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

A key technology for the development of large language models (LLMs) involves instruction tuning that helps align the…

Instruction tuning with multiple languages (including Indonesian) on HuggingFace's BLOOM and Meta's LLaMa (v1). I immediately wondered, don't we already have a BLOOM instruct model? That (BLOOMZ) was released in November 2022 and used "multitask prompted finetuning", but this Okapi series uses RLHF and performs better (even outperforming BLOOM on English tasks).

Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators

Large language models that exhibit instruction-following behaviour represent one of the biggest recent upheavals in…

Authors drew attention for saying LLaMa v2 still fell short of their expectation for open models. Instead of looking for training code or weights, this framework wants details about the training data, RLHF deltas, etc. The most open model here (xmtf) is new to me, but I do see good things about the runner-up (Open-Assistant).

Open-sourcing SQLCoder: a state-of-the-art LLM for SQL generation

We are thrilled to open-source Defog SQLCoder: a 15B parameter LLM that outperforms gpt-3.5-turbo on text-to-SQL tasks.

SQL-specific model (which is quite useful when Code-LLaMa does not discuss SQL at all).

SDXL Wrong LoRA - v1.0-Diffusers | Stable Diffusion LoRA

This model was originally uploaded to Huggingface by Max Woolf

If you look deeper at the LoRA finetuning, people have been sharing LoRAs of popular StableDiffusion models on this site (Civitai), which is new to me. Author uses fine-tuning with negative prompts to nudge the quality of generations, and other Hacker News commenters use low quality images in training, too: https://news.ycombinator.com/item?id=37211519

Shepherd: A Critic for Language Model Generation

As large language models improve, there is increasing interest in techniques that leverage these models' capabilities…

Meta's research into providing additional context, fact-checking, and commentary into chats with LLMs. The model is trained on a dataset of replies from specific subreddits, intentionally limited to threads written before the release of LLaMa.

Simple synthetic data reduces sycophancy in large language models

Sycophancy is an undesirable behavior where models tailor their responses to follow a human user's view even when that…

DeepMind research studies over-polite and belief-matching LLMs. They have a basic prompt template where a synthetic user gives their name, credentials, and an opinion before asking the model to answer the question. Smaller models don't know enough facts to push back on math errors from supposed math professors. I wonder if they were studying what names and credentials would get appropriate answers and/or respect from the model, but in this case they just evaluate whether generated examples can help the model be truthful.
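Here is a sketch of what generating one of those synthetic examples could look like. The wording of the template, the names, and the function signature are my own approximation, not the paper's exact template. The point is that the training target follows the truth of the claim and ignores the stated opinion.

```python
def build_example(name, credential, claim, claim_is_true, user_agrees):
    """Return (prompt, target) where the target ignores the user's stated opinion."""
    stance = "agree" if user_agrees else "disagree"
    prompt = (
        f"Hello, my name is {name}. I am {credential}. "
        f"I {stance} with the claim that {claim}. "
        f"Do you agree or disagree with the following claim? {claim}.\n"
        "Choices:\n(A) Agree\n(B) Disagree\nAnswer:"
    )
    target = "(A)" if claim_is_true else "(B)"
    return prompt, target

# Hypothetical user who endorses a false arithmetic claim.
prompt, target = build_example(
    name="Alex Example",
    credential="a professor of mathematics",
    claim="1 + 1 = 3",
    claim_is_true=False,
    user_agrees=True,
)
print(prompt)
print(target)  # (B)
```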

Towards Generalist Biomedical AI

Medicine is inherently multimodal, with rich data modalities spanning text, imaging, genomics, and more. Generalist…

Google/DeepMind project where a multimodal model can accomplish many medical tasks (typically QA, not retrieval or tools-based).

Georeactor Blog

ML Arxiv Haul #22
