Cleanlab, 75 Hawthorne Street, Suite 560, San Francisco, CA (2026)

08/08/2024

Considering switching LLMs but dreading redoing all of your LLM Evals? Our latest blog post highlights how Cleanlab can help you scale your evaluations with multiple reviewers, without sacrificing accuracy!

CROWDLAB improves your team's LLM Evals process by automatically producing reliable ratings and flagging which outputs need further review.
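
To make the idea of combining multiple reviewers concrete, here is a minimal sketch that computes a consensus rating per LLM output and flags outputs where reviewers disagree. It uses plain majority vote, which is an assumption made for illustration and not the CROWDLAB method itself. The function name `aggregate_ratings`, the `min_agreement` threshold, and the toy ratings are all hypothetical.

```python
from collections import Counter

def aggregate_ratings(ratings_per_output, min_agreement=0.7):
    """Majority-vote consensus per output, plus a flag for low reviewer agreement.

    Illustrative stand-in only: this is simple majority vote, not CROWDLAB.
    """
    results = []
    for ratings in ratings_per_output:
        counts = Counter(ratings)
        consensus, votes = counts.most_common(1)[0]
        agreement = votes / len(ratings)
        results.append({
            "consensus": consensus,
            "agreement": agreement,
            # Outputs with weak agreement are sent back for further review.
            "needs_review": agreement < min_agreement,
        })
    return results

# Hypothetical data: each inner list holds the reviewers' ratings for one LLM output.
ratings = [
    ["good", "good", "good"],
    ["good", "bad", "good"],
    ["bad", "good"],
]
results = aggregate_ratings(ratings)
for r in results:
    print(r)
```

Only the first output passes without review here; the other two fall below the assumed 0.7 agreement threshold.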

11/02/2023

Would you trust medical AI that’s been trained on pathology/radiology images where tumors/injuries were overlooked by data annotators or otherwise mislabeled? Most image segmentation datasets today contain tons of errors because it is painstaking to annotate every pixel.

Introducing Cleanlab for segmentation to automatically catch annotation errors in image segmentation datasets, before they harm your models! Quickly use Cleanlab to detect bad data and fix it before training/evaluating your segmentation models. This is the easiest way to increase the reliability of your data & AI!

We've freely open-sourced our new method for improving segmentation data and published a paper on the research behind it.

Learn more:
- Blog: https://cleanlab.ai/blog/segmentation-errors/
- Research Paper: https://arxiv.org/abs/2307.05080
- 5min Code Tutorial: https://docs.cleanlab.ai/stable/tutorials/segmentation.html
- Star our GitHub to support open-source and open science in Data-Centric AI: https://github.com/cleanlab/cleanlab

Introducing new methods for estimating labeling quality in image segmentation datasets.
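
As a rough illustration of how a per-image label quality score can work, the sketch below scores each pixel by the model's predicted probability for its annotated class, then pools the pixel scores with a softmin so the worst pixels dominate. This is a simplified pure-Python sketch under stated assumptions, not the library's implementation: images are flattened to lists, the temperature of 0.1 is assumed, and the helper names are hypothetical. See the code tutorial linked above for the actual API.

```python
import math

def pixel_scores(labels, pred_probs):
    """Per-pixel score: the model's predicted probability of the annotated class."""
    return [pred_probs[i][labels[i]] for i in range(len(labels))]

def softmin_image_score(scores, temperature=0.1):
    """Softmin-weighted average of pixel scores; low-scoring pixels get the most weight."""
    weights = [math.exp((1.0 - s) / temperature) for s in scores]
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total

# Hypothetical 4-pixel image with 2 classes; pred_probs[i] holds class probabilities for pixel i.
pred_probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
clean_labels = [0, 0, 1, 1]
noisy_labels = [0, 0, 1, 0]  # last pixel annotated with the wrong class

clean_score = softmin_image_score(pixel_scores(clean_labels, pred_probs))
noisy_score = softmin_image_score(pixel_scores(noisy_labels, pred_probs))
print(clean_score, noisy_score)
```

The image with the mislabeled pixel receives a much lower score, so sorting images by this score surfaces likely annotation errors first.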

10/05/2023

When generating synthetic data with LLMs (GPT4, Claude, …) or diffusion models (DALLE 3, Stable Diffusion, Midjourney, …), how do you evaluate how good it is?

Introducing: Quality scores to systematically evaluate a synthetic dataset with just one line of code! Use Cleanlab’s synthetic dataset scores to rigorously guide your prompt engineering (much better signal than just manually inspecting samples). These scores also help you tune settings of any synthetic data generator (e.g. GAN or probabilistic model hyperparameters) and compare different synthetic data providers.

Cleanlab scores comprehensively evaluate a synthetic dataset for different shortcomings including unrealistic examples, low diversity, overfitting/memorization of real data, and underrepresentation of certain real scenarios. These scores are universally applicable to image, text, and structured/tabular data!

Want to see a real application of these scores? Check out our new blog on prompt engineering to generate a good synthetic image dataset with Stable Diffusion ⬇

Systematically evaluate synthetic datasets via quantitative scores. Use these scores to guide prompt engineering and other synthetic data generator optimizations.
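
To give a feel for one of the shortcomings mentioned above (memorization of real data), here is a toy sketch that measures what fraction of synthetic texts are near copies of a real example. It is an illustration only and not how Cleanlab computes its scores: the token-overlap (Jaccard) similarity, the 0.8 threshold, the function names, and the sample reviews are all assumptions.

```python
def jaccard(a, b):
    """Token-set overlap between two strings (1.0 means identical token sets)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0

def memorized_fraction(synthetic, real, threshold=0.8):
    """Fraction of synthetic texts whose closest real text meets the similarity threshold."""
    flagged = [s for s in synthetic if max(jaccard(s, r) for r in real) >= threshold]
    return len(flagged) / len(synthetic)

# Hypothetical real and synthetic customer reviews.
real = ["the battery lasts all day", "shipping was slow but the product works"]
synthetic = [
    "the battery lasts all day",                     # exact copy of a real review
    "great screen and very light to carry",
    "shipping was slow but the product works fine",  # near copy of a real review
    "the strap broke after one week",
]

fraction = memorized_fraction(synthetic, real)
print(fraction)
```

A high fraction suggests the generator is copying real data instead of producing new examples, which is one signal to adjust prompts or generator settings.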

08/09/2023

🤔 How do you trust data analytics built on bad data?

❓ Ever wonder why there’s a mismatch between your analytics report and actual outcomes? Or perhaps, found yourself doubting the reliability of how your dataset was collected? You're not alone.
🔍 The Problem:
Your analytics are only as good as your data. It's a principle as old as data processing itself that we've all heard a million times: "Garbage in, garbage out". This means even the slickest, most advanced analytics tool can mislead if fed flawed data. Analytics based on raw data with errors can lead to incorrect conclusions and suboptimal decisions.

💡The Solution: Cleanlab Studio
1️⃣ Automated Data Cleaning: Beyond just presenting data, our AI automatically catches issues in your data like categorization errors, outliers, (near) duplicates, and ambiguous data and rectifies them in real-time.
2️⃣ More Accurate Analytics: By improving the underlying data, Cleanlab Studio ensures the resulting metrics you see are always on point.
3️⃣ Seamless Integration: A breeze to incorporate, with its user-centric design and compatibility with numerous data environments and modalities for ingestion and export.

🌐 The Big Picture: By using Cleanlab Studio, you're not just opting for another analytics tool. You're investing in clarity, confidence, and a clearer path forward by improving your data. Stop doubting your reports and manually hunting for errors — let Cleanlab Studio build your trust back in your data!

🔗 Take the Next Step!
Looking to harness these incredible benefits for your enterprise? Sign up for the enterprise offering of Cleanlab Studio below NOW! https://cleanlab.ai/sales

Cleanlab offers solutions that easily reduce time/cost to turn unreliable data into reliable models and analytics. We provide Data-Centric AI software to automatically find and fix issues in your datasets and assess quality so you can trust your data and the solutions you build on your data.

08/03/2023

💼 💰 📊 Cleanlab Studio saves law firm millions of dollars (and a month of litigation time)!

Since the VentureBeat announcement of our seed round and the launch of Cleanlab Studio for Enterprise, the initial traction has been exciting. We’re getting a lot of questions about how specifically we have helped certain markets, so we’d like to share a legal application.

As an enterprise customer of Cleanlab Studio, Berkeley Research Group saved their client (a major law firm in the U.S.) a MONTH of time for a litigation case, saving them millions of dollars for a fraction of the cost.

Now that Cleanlab Studio for Enterprise is out, we’re excited to announce a new Enterprise feature added this quarter: support for relevance determination and e-discovery. It completes our end-to-end, full-featured AI reliability solution for enterprise, which takes any real-world dataset and deploys reliable models. Cleanlab Studio autonomously detects mis-categorized legal documents, enhancing the accuracy of categorization and helping to avoid costly mistakes and potential bias.

What does this mean for law firms and consultancies?
💰💡 Save millions of dollars and working-hours
💰💡 Accomplish with 5 lawyers what previously required 50
💰💡 Reduce the potential for errors.

Imagine if you could reduce your workforce needs for e-discovery by 90%, while simultaneously improving the accuracy of your relevance determination. What kind of impact would that have on your bottom line?

➡ Check out the blog to learn how your firm can leverage Cleanlab Studio to enhance your e-discovery process and save valuable resources!
http://cleanlab.ai/blog/studio-legal-discovery/

➡ Sign up for the enterprise offering of Cleanlab Studio below.
http://cleanlab.ai/sales

Use AI software to automatically identify mis-categorized legal documents and provide more accurate relevance determination.

07/31/2023

“In my experience, the phrase ‘you are what you eat’ is exponentially more applicable to AI than to humans.”

This tweet reflects how folks are finally realizing that AI is becoming data-centric. But what does the future hold?

What data curation and modeling work will be done manually vs automated?

At Cleanlab, we believe significant automation is necessary to ensure the health of ever-increasing amounts of data and models. Our newest article outlines a vision of how automation-aided AI workflows should look. (Hint: AI itself can facilitate many of the steps needed to turn raw data into reliable model deployments.)

👉 https://cleanlab.ai/blog/data-centric-ai/

The article outlines: how we plan to get there, why we are building open-source and AI platform software, and key differences between these offerings. Read it to learn why data itself should now be improved using AI.

07/12/2023

🎉New Blog: Assessing the Quality of Synthetic Data with Cleanlab Studio

Many folks are using LLMs to generate data nowadays, but how do you know which synthetic data is good?

I'm excited to introduce Cleanlab for synthetic data quality assessment! Without writing ANY code, you can quickly identify which synthetic data is unrealistic (i.e. low-quality) and which real data is underrepresented in the synthetic samples. This tool works seamlessly across synthetic text, image, and tabular datasets.

Our blogpost demonstrates how to automatically detect issues in synthetic customer reviews data generated from the Gretel.ai LLM synthetic data generator.
https://cleanlab.ai/blog/studio-synthetic-data/

Use AI to measure the quality of LLM-generated data, automatically detecting unrealistic synthetic examples and underrepresented tails of the real data distribution.

07/06/2023

🗣 New Blog Alert: Enhancing Product Analytics and E-commerce with Cleanlab Studio

Say goodbye to data inconsistencies and hello to accurate product listings and analytics! With Cleanlab Studio, you can automatically identify miscategorized products and errors in massive catalogs.

Accurate product listings are vital for e-commerce businesses: they enable accurate analytics, a better customer experience, improved product discoverability, and more effective SEO and advertising campaigns. Cleanlab Studio can automatically rectify errors in your e-commerce product catalog, enabling your business to succeed. The best part is that you can achieve this without writing any code.

Read the full blog post to learn how it can revolutionize your E-commerce game.
https://cleanlab.ai/blog/studio-ecommerce/

Sign up for Cleanlab Studio and experience the power of clean data!

Using Cleanlab Studio to analyze product listings for errors, and how this can enhance product categorization and analytics efforts.

06/29/2023

📚 New Blog Alert: "Beware of Unreliable Data in Model Evaluation: A LLM Prompt Selection case study with Flan-T5"

📌 Summary:
Reliable model evaluation is crucial in MLOps and LLMOps for making important decisions such as model selection and prompt deployment. However, in a recent study using the FLAN-T5 LLM from Google Research, we discovered that the prompts with the highest observed test accuracy were actually worse than other prompt candidates. The root cause? Unreliable annotations in the test data. This highlights the need to clean your test data to ensure optimal choices for prompt selection.

💡 Key Findings:
1️⃣ Noisy labels in test data make test accuracy less reliable for comparing models or prompt choices.
2️⃣ We evaluated the FLAN-T5 LLM using different types of prompts and observed a discrepancy between the prompts' observed test accuracy and their clean test accuracy.
3️⃣ Statistical analysis confirmed the significant difference in clean test accuracy, indicating the importance of careful audit and verification of test data.
4️⃣ We showcased the impact of unreliable data on prompt selection for various prompt types, highlighting the need for data quality assessment.

⚡ Don't let unreliable data misguide your model evaluation! Read the blog to see how you can make informed decisions for LLM prompt selection.

You may choose suboptimal prompts for your LLM (or make other suboptimal choices via model evaluation) unless you clean your test data.
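
The effect described above can be reproduced on a toy example. The sketch below compares two prompts by accuracy against noisy test labels and against corrected labels, and the winner flips. All labels and predictions are made-up illustrative data, not results from the FLAN-T5 study, and the names `prompt_a` and `prompt_b` are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the given labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical test set of 10 examples; items 3 and 7 are mislabeled in the noisy version.
noisy_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
clean_labels = [1, 0, 1, 0, 0, 1, 0, 1, 1, 0]

# Hypothetical predictions from the same LLM under two candidate prompts.
prompt_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]  # happens to agree with the label errors
prompt_b = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]  # correct on the mislabeled items

for name, preds in [("prompt_a", prompt_a), ("prompt_b", prompt_b)]:
    observed = accuracy(preds, noisy_labels)
    clean = accuracy(preds, clean_labels)
    print(f"{name}: observed={observed:.1f} clean={clean:.1f}")
```

On the noisy labels prompt_a looks better (0.9 vs 0.7), but on the corrected labels prompt_b is the better choice (0.9 vs 0.7), which is why the test data should be audited before selecting a prompt.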

06/27/2023

OpenAI vs Data-Centric AI: which produces better models for predicting legal outcomes from court documents?

Large Language Models from OpenAI and other providers like Cohere, Harvey, and Hugging Face are advancing what can be predicted from text data in court cases. Like most real-world datasets, legal document collections contain issues that can be addressed to improve the accuracy of any model trained on that data. This article shows that data problems limit the reliability of even the most cutting-edge LLMs for predicting legal judgments from court case descriptions. https://cleanlab.ai/blog/studio-model-deployment-legal/

Finding and fixing these data issues is tedious, but we demonstrate an automated solution to refine the data using AI. Using this solution to algorithmically increase the quality of training data from court cases produces a 14% error reduction on model predictions without changing the type of model used! This data-centric AI approach works for any ML model and enables simple types of models to significantly outperform the most sophisticated fine-tuned OpenAI LLM in this legal judgment prediction task.

Simply put: feeding your models healthy data is more important than what particular type of model you choose to use!

A legal sector case study using Cleanlab Studio to produce better models for making predictions (e.g. of final judgements) based on court case documents.
