Faust-1: A German-First Language Model That Runs on Your Laptop
Introducing Faust-1, a 1.6B-parameter language model trained from scratch for German. Optimized for local deployment on consumer hardware: no cloud and no data-center GPUs required.
Insights, updates, and thoughts on AI, machine learning, and data automation.
We introduce YapBench, a benchmark for measuring how much LLMs over-explain when answering simple questions. Our evaluation of 76 models reveals an order-of-magnitude spread in verbosity, with newer models trending longer.
Detect and redact 42 types of personal data across all 24 EU languages with 97% accuracy. Runs on-premises, so no data leaves your infrastructure. GDPR Article 17 compliant. An alternative to cloud PII APIs.
Learn how our open-source GReaT framework uses transformer language models to generate high-quality synthetic tabular data. Published at ICLR 2023, with 140,000+ downloads and adopted on Google's Kaggle platform.