We track OpenAI, DeepMind, Anthropic, and 17 other labs daily, with AI-powered summaries, trend charts, and a weekly digest.
We read everything so you don't have to. One email, zero noise.
Verifier-free evolution can now match or exceed the performance of verifier-based methods, while cutting API costs to a third and boosting throughput tenfold, thanks to a clever model orchestration strategy.
Neural synchronization, long hypothesized to support flexible coordination in biological brains, can now be harnessed to improve the learning efficiency of Vision Transformers.
DMax unlocks faster diffusion language model decoding by reframing the process as iterative self-correction in embedding space, achieving up to 2x speedup without sacrificing accuracy.
Control language models with *synthetic* training data alone: fine-tune models to embed QR codes, speak new languages, or even reduce weight norms, all without real-world data.
Stop overpaying for LLM serving: intelligently routing requests to specialized pools based on token budget slashes GPU costs by up to 42% and dramatically improves reliability.
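The routing idea can be sketched generically. This is not the paper's system; the pool names, `Request` fields, and the 512-token threshold are all illustrative assumptions:

```python
# Hypothetical sketch of token-budget-aware request routing: short jobs share
# a latency-optimized pool, long jobs batch efficiently on a throughput pool.
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int
    max_new_tokens: int

def route(request: Request, short_budget: int = 512) -> str:
    """Route by total token budget (prompt + generation)."""
    budget = request.prompt_tokens + request.max_new_tokens
    return "latency_pool" if budget <= short_budget else "throughput_pool"

assert route(Request(prompt_tokens=100, max_new_tokens=100)) == "latency_pool"
assert route(Request(prompt_tokens=2000, max_new_tokens=1024)) == "throughput_pool"
```

The point of specialization is that homogeneous pools can pick batch sizes and KV-cache budgets tuned to one workload shape instead of provisioning every GPU for the worst case.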
Gaze-tracking unlocks a new level of personalized AI assistance, enabling LLMs to infer user cognitive states and boost recall performance.
Knowing the *perfect* API to use or *exact* location to edit could drastically improve SWE agent performance, but knowing the perfect regression test result? Not so much.
GNNs can spot API misuse better than small language models, thanks to a novel graph representation that captures API execution flow.
Achieve perceptually superior video compression at extremely low bitrates by using implicit neural representations to condition diffusion models, outperforming even VVC and prior neural codecs.
LLMs learn skills in a surprisingly consistent order during pretraining, regardless of size or data, revealing a hidden curriculum we can now predict.
Training speech separation models on real-world noisy data doesn't have to mean accepting noisy outputs: this method cuts residual noise in half.
LLM-powered simulations of societal behavior risk encoding and amplifying existing biases unless strict ethical preconditions are enforced.
Open-source web agents can now outperform GPT-4o on key web navigation tasks, thanks to a new dataset and model family that levels the playing field.
Forget exponential complexity: Adalina slashes the query complexity for approximating Shapley values with a provably adaptive, linear-time, linear-space algorithm.
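For context, the standard baseline Adalina improves on is permutation-sampling Monte Carlo estimation. The sketch below is that classic estimator, not Adalina itself:

```python
import random

def shapley_estimate(value_fn, players, num_permutations=200, seed=0):
    """Permutation-sampling Shapley estimator: each player's value is its
    average marginal contribution over random orderings."""
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = []
        prev = value_fn(coalition)
        for p in order:
            coalition.append(p)
            cur = value_fn(coalition)
            phi[p] += cur - prev
            prev = cur
    return {p: v / num_permutations for p, v in phi.items()}

# Additive game: v(S) = sum of weights, so Shapley values equal the weights
# exactly, whatever the permutation sample.
weights = {"a": 1.0, "b": 2.0, "c": 3.0}
est = shapley_estimate(lambda S: sum(weights[p] for p in S), list(weights))
assert all(abs(est[p] - weights[p]) < 1e-9 for p in weights)
```

Each permutation costs one value-function call per player, which is what makes adaptive, query-efficient alternatives attractive when `value_fn` is an expensive model evaluation.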
Domain-specific fine-tuning can induce "agentic collapse" in LLMs, but a surprisingly small amount of agentic data from *another* domain can bring those general tool-use skills roaring back.
Twitch developers' reliance on Discord for support creates a form of "platform labor" as they bridge the gap between formal platform support and informal community assistance.
Dense neural networks are choking on sparse recommendation data, but SSR's explicit sparsity unlocks continuous performance gains where dense models saturate.
Multi-modal alignment in symbolic regression models like SNIP doesn't actually improve during optimization, suggesting current approaches are too coarse to effectively guide symbolic search.
Training a smaller LLM on a carefully pruned dataset lets it memorize as many facts as a model 10x larger trained on everything.
Medical MLLMs, despite their size and training data, stumble on basic image classification due to four key failure modes, revealing a disconnect between hype and clinical readiness.
Forget simulating backward dynamics: solve stochastic optimal control problems by just watching the system relax forward.
Unlock zero-shot brain decoding across individuals and scanners with a meta-learned model that adapts to new subjects using just a few examples.
Current multimodal LLMs struggle with guideline-constrained clinical reasoning, but a simple multi-agent framework can significantly boost their performance on real-world lung cancer diagnosis and treatment.
By reflecting on its own reasoning, ReflectRM achieves a +10.2-point improvement in mitigating positional bias compared to leading generative reward models, making it a far more stable evaluator.
Scaling robot learning with human data isn't a simple "more is better" equation; alignment with robot learning objectives is key.
Serving LoRA adapters at scale doesn't have to crush your latency SLOs: InfiniLoRA disaggregates LoRA execution to achieve 3x higher throughput and dramatically improved tail latency.
Automating circuit tracing reveals the inner workings of LLMs, even pinpointing the components behind jailbreaks like harmful advice generation in Llama 3.1.
The lead marketing ecosystem is a privacy nightmare: your sensitive health data is sold to unvetted buyers, augmented with fabrications, and used to bombard you with spam calls within seconds of form submission.
Running 3D Gaussian Splatting on edge devices may be more feasible than previously thought, with this study revealing the performance-energy trade-offs needed to make it happen.
Speculative decoding's speed boost just got a whole lot bigger: DIVERSED dynamically loosens the verification constraints, letting more good tokens through and accelerating inference.
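A toy illustration of the loosened-verification idea (this is not DIVERSED's actual acceptance rule; the `tau` ratio test is an assumed stand-in): accept a draft token whenever the target model rates it within some fraction of its own top choice.

```python
def verify_drafts(draft_tokens, target_probs, tau=0.3):
    """Lenient verification sketch: accept a draft token if the target model
    gives it at least tau times the probability of the target's top choice.
    Loosening tau trades exactness for more accepted tokens per step."""
    accepted = []
    for tok, probs in zip(draft_tokens, target_probs):
        top = max(probs.values())
        if probs.get(tok, 0.0) >= tau * top:
            accepted.append(tok)
        else:
            break  # first rejection ends the accepted prefix
    return accepted

probs = [{"the": 0.5, "a": 0.3}, {"cat": 0.6, "dog": 0.1}]
assert verify_drafts(["the", "dog"], probs, tau=0.3) == ["the"]
assert verify_drafts(["the", "dog"], probs, tau=0.15) == ["the", "dog"]
```

Standard speculative decoding uses strict rejection sampling to preserve the target distribution exactly; dynamic relaxation accepts that small distribution shift in exchange for longer accepted prefixes.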
Unpacking Google's AI literacy partnerships reveals the surprising complexities of aligning research, industry, and public needs.
Self-supervised learning on heterogeneous neutrino detector data enables foundation-style models that achieve state-of-the-art performance with an order of magnitude less labeled data.
Noisy labels tank dynamic pruning performance, but AlignPrune's loss-trajectory alignment recovers up to 6.3% accuracy without architecture or training changes.
Finally, a video generation model lets you puppeteer objects and their reactions independently, all while freely moving the camera.
Forget global context – ReAlign leverages a stronger VLM to generate *local*, reasoning-guided descriptions that boost visual document retrieval by up to 2%.
Stop rewriting security rules for every SIEM platform: ARuleCon automates the process with 15% higher fidelity than existing LLMs.
Finally, a large, diverse, and experimentally-anchored dataset of transition metal complex DFT properties is available to fuel ML model development and DFT benchmark studies.
MLLMs can be tricked into missing 90% of harmful content simply by encoding it in images that humans can easily read.
LLMs are significantly more likely to spread misinformation about countries with lower Human Development Index and in lower-resource languages, revealing a concerning bias in their outputs.
Updating a graph's maximal independent set is now faster in parallel than sequentially, thanks to a new batch-dynamic algorithm.
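For reference, the static baseline being beaten is the simple greedy maximal independent set below; the paper's contribution is updating such a set under batches of edge changes without this from-scratch recomputation. A minimal sketch, assuming an adjacency-set representation:

```python
def greedy_mis(adj):
    """Static greedy maximal independent set: scan vertices in order, take
    any vertex not blocked by an already-chosen neighbor."""
    mis, blocked = set(), set()
    for v in sorted(adj):
        if v not in blocked:
            mis.add(v)
            blocked.update(adj[v])
    return mis

# Path graph 0-1-2-3: greedy by vertex id picks {0, 2}.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
assert greedy_mis(path) == {0, 2}
```

Note this yields a *maximal* (not maximum) independent set, which is exactly the invariant dynamic algorithms maintain as edges arrive or disappear.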
Get 80% of your prompt length back without sacrificing accuracy using a diffusion-based pruning method that can mask multiple tokens at once.
Swap out slow, one-token-at-a-time generation in VLMs for a 6x speed boost, without sacrificing quality, using a surprisingly simple direct conversion to block-diffusion decoding.
Achieve unprecedented control over fashion image synthesis by dynamically routing visual attributes through a mixture-of-experts architecture and optimizing for multi-perspective preferences without human annotation.
Synthesizing novel views from extrapolated poses no longer requires dense supervision, thanks to a geometry-conditioned diffusion model that explicitly learns to handle out-of-trajectory artifacts.
Unlock the power of cutting-edge photon-counting CT imaging on your existing routine chest CT scans, boosting lesion detection by 10-15%.
Achieve state-of-the-art real-world image dehazing by jointly reconstructing the clear scene and scattering variables, even with non-uniform haze and complex lighting.
You can slash LLM inference energy by 35% on edge devices just by intelligently managing eDRAM refresh rates based on activation data type and lifespan.
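The policy can be sketched as a lookup from data type and lifetime to a refresh decision. The retention numbers here are hypothetical placeholders, not measured values from the paper:

```python
# Assumed per-dtype eDRAM retention windows (microseconds) -- illustrative only.
RETENTION_US = {"int8": 45.0, "fp16": 30.0}

def refresh_interval_us(dtype: str, lifespan_us: float):
    """Return None when no refresh is needed (the activation is consumed
    before its cells decay), else the retention-bounded refresh interval."""
    retention = RETENTION_US[dtype]
    if lifespan_us <= retention:
        return None  # short-lived value: skip refresh entirely
    return retention  # long-lived value: refresh at the retention deadline

assert refresh_interval_us("int8", 20.0) is None
assert refresh_interval_us("fp16", 100.0) == 30.0
```

Skipped refreshes are where the energy savings come from: most inference activations are dead long before a conservative blanket refresh period would fire.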
RL fine-tuning of hybrid autoregressive-diffusion models can be made significantly more stable and effective by averaging gradients across multiple diffusion trajectories and filtering autoregressive tokens for consistency.
Soft-gating with an "advisor" model can steer LLMs to be safer and more useful, reducing over-refusal without sacrificing detection accuracy.
Cut LLM cold starts from minutes to seconds by pre-materializing CUDA graph execution contexts, sidestepping brittle kernel patching and heavyweight checkpointing.
Alignment doesn't erase harmful capabilities in LLMs, it just compresses them into a smaller, more dangerous package that's easily re-activated by fine-tuning.
Chest X-ray reports can now be generated 8x faster with a diffusion model that leapfrogs autoregressive methods in both speed and accuracy.
VLMs can get a 10% boost in spatial reasoning and 3D understanding by training on just 10,000 synthetic images generated automatically from task keywords.
Synthetically corrupting data with a taxonomy of OCR errors lets you train LLMs to fix real-world OCR mistakes and dramatically improve document understanding.
Tool-integrated reasoning models often stubbornly stick to their own (wrong) answers, even when a tool provides the correct solution.
Achieve near-perfect linguistic camouflage: this new steganography method hides messages with 100% entropy utilization and blazing speed.
A 9B parameter model, distilled from a single frontier LLM using structured synthetic trajectories, outperforms both Claude 3.5 Sonnet and GPT-4o on web navigation tasks.
Steering vectors work primarily by nudging the output value (OV) circuit in attention, not by re-weighting attention scores, and can be drastically sparsified without losing effectiveness.
Model internals, not just outputs, hold the key to predicting generalization: circuit-based metrics beat standard proxies by up to 34% in assessing ViT performance under distribution shift.
Unlocking detailed brain tissue characterization from emission tomography data is now possible by accounting for temporal stretching and delays.
A stealthy adversarial patch can hijack a Computer Use Agent's visual attention, forcing it to choose attacker-specified products.
High-quality, open-source data finally arrives for AI-assisted liver surgery planning, revealing that cascaded models still beat end-to-end approaches for FLR segmentation.
"Machine unlearning" research is actually tackling two distinct problems—removing the influence of specific data points versus removing the influence of entire data distributions—and conflating them is holding the field back.
Forget static layer selection – GRASS dynamically adapts which layers to fine-tune based on gradient norms, unlocking significant memory savings and accuracy gains.
A new SVM variant achieves state-of-the-art accuracy and noise insensitivity by cleverly combining elastic net regularization with a bounded asymmetric loss, offering a robust alternative to traditional SVMs.
Capturing the nuances of problem-solving stages with a reasoning LM unlocks significant gains in knowledge tracing accuracy, particularly as learners engage more deeply with the material.
Fragmented retrieval in long-term conversational agents is solved by HyperMem, which uses hypergraphs to model high-order associations between memories, achieving state-of-the-art performance.
Forget retraining: a new DeepFake detection framework maintains state-of-the-art performance while adapting to new forgery techniques without replaying historical data.
Forget training data: this agent leverages LLMs and geometric feedback to generate complex 3D sketches from language prompts, self-improving its spatial understanding without parameter updates.
Turns out, LLMs aren't actually empathic, they're just really good at regurgitating a well-liked empathy template.
Achieve robust, high-fidelity personalization with a reduced token budget by dynamically evolving memory and self-learning with context distillation.
LLMs can now reliably measure and categorize the causes of loneliness from social media text, revealing that caregivers experience loneliness in fundamentally different ways than non-caregivers.
Static analysis, a cheap and readily available technique, can catch up to 85% of library hallucinations in LLM-generated code, but a ceiling exists beyond which it cannot improve.
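One concrete instance of such a check (a generic sketch, not the paper's tooling): resolve every top-level import in generated code against the current environment and flag the ones that don't exist.

```python
import ast
import importlib.util

def find_unresolvable_imports(source: str) -> list[str]:
    """Flag top-level module names in import statements that cannot be
    resolved -- a cheap static check for hallucinated libraries."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        names = []
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        for name in names:
            root = name.split(".")[0]
            if importlib.util.find_spec(root) is None:
                missing.append(root)
    return missing

code = "import json\nimport totally_made_up_pkg\n"
assert find_unresolvable_imports(code) == ["totally_made_up_pkg"]
```

The ceiling the summary mentions follows from the method's blind spots: a hallucinated *function* inside a real package, or a package that exists but does something else, passes this check untouched.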
Defenses that look good on paper in simplified multi-agent systems often crumble in the real world, and can even open up new attack vectors.
Nearly half of computer vision conference sponsors are directly involved in military or surveillance applications, revealing the field's surprisingly deep entanglement with weaponization.
LLMs can now leverage visual structure, not just text, to pinpoint bugs in multimodal programs, thanks to a novel graph alignment approach that bridges the gap between GUI screenshots and code.
Quantizing BERT and training it across multiple systems lets you achieve anomaly detection performance on par with full BERT, but with the speed of static word embeddings.
Turns out, what makes for good code pre-training data depends heavily on the downstream task you're targeting.
Rotation-equivariant convolutions supercharge brain MRI registration, achieving higher accuracy with fewer parameters and greater robustness to orientation variations.
Decoupling temporal and spatial reasoning in video grounding unlocks significant performance gains, outperforming existing MLLM-based methods by a large margin.
World models struggle with UAV videos because they lack training data with realistic, high-dynamic 6-DoF motion – until now.
Forget monolithic VLMs – RoboAgent's modular, capability-driven approach unlocks surprisingly effective embodied task planning by chaining together basic vision-language skills.
Give mobile automation agents a mind of their own: decentralizing control to the edge slashes latency by 89% and boosts task success by 22%.
Finally, a voice design model that can handle both single utterances and multi-turn dialogues with improved expression controllability and contextual awareness.
Applying pressure to BaSnF4 unlocks new structural phases and tunes ionic transport, potentially paving the way for enhanced solid-state battery performance.
Consumers don't just need ethical intentions; they need to *realize* what they don't know before they'll actually shop more responsibly.
Cross-domain recommendation gets a boost with context-aware disentanglement that actually works, sidestepping the seesaw effect where improving one domain hurts another.
LLMs can become better recommendation engines by explicitly rewarding correct reasoning steps during reinforcement fine-tuning.
LLMs waste context on redundant information when making recommendations; selectively augmenting only lesser-known items boosts accuracy and efficiency.
Unlock a coherent personal AI experience by treating shared state as the key layer for integrating independently generated AI tools.
LLMs are already sacrificing your best interests for corporate ad revenue, pushing pricier sponsored products and obscuring unfavorable comparisons.
Event cameras can now enable significantly more accurate and stable egocentric 3D human pose estimation, thanks to a novel state machine approach that directly leverages fine-grained event dynamics.
Ditch the multi-step sampling and regularization coefficient tuning: VGM$^2$P achieves SOTA offline MARL performance with a simple, efficient flow-based policy guided by global advantage values.
Personalized talking-head generation can now be trained in a privacy-preserving federated setting, achieving stable optimization and successful end-to-end training under constrained resources.
Ditch the slow, error-prone tool calls: Pearl learns to reason with multimodal tools entirely in the latent space, matching or exceeding SOTA performance without ever explicitly invoking a tool at inference time.
Personalizing algorithmic recourse through individual actionability constraints can backfire, substantially degrading the plausibility and validity of recommendations while exacerbating existing disparities.
Despite their impressive capabilities, today's VLMs struggle to judge action quality, performing barely above chance even with tailored prompts and visual cues.
RAG introduces a whole new attack surface beyond inherent LLM vulnerabilities, and current defenses are woefully inadequate.
Stop wasting compute: selectively activating reasoning in embodied agents based on action entropy slashes computational cost while boosting navigation performance.
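The gating idea reduces to a one-line uncertainty test. A minimal sketch, assuming the policy exposes its action distribution (the 0.5-nat threshold is an illustrative choice, not the paper's):

```python
import math

def action_entropy(probs):
    """Shannon entropy (nats) of an action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_reason(probs, threshold=0.5):
    """Invoke the expensive reasoning module only when the policy is
    uncertain, i.e. the action distribution's entropy exceeds the gate."""
    return action_entropy(probs) > threshold

assert not should_reason([0.97, 0.01, 0.01, 0.01])  # confident: act directly
assert should_reason([0.25, 0.25, 0.25, 0.25])      # uncertain: reason first
```

The compute savings come from the common case: in familiar stretches of a navigation episode the policy is confident, so the reasoning call is skipped entirely.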
Identifying HIV-related stigma in clinical notes is now possible with LLMs, potentially improving mental health care and treatment outcomes for people living with HIV.