Language Models
Modern language models enable secure, real-time AI on local devices in industries such as automotive, robotics, and healthcare. The goal of DSgenAI is to build a complete pipeline, spanning specialized data curation, base model pre-training, and task-oriented optimization, that delivers lean, domain-specific models for industry-relevant tasks and demonstrates their value in deployable prototypes.
For further information, contact Sebastian Scharrer at Fraunhofer IIS.

Publications
- Joint Workshop on Legal and Ethical Issues in Human Language Technologies and Computational Approaches to Language Data Pseudonymization, Anonymization, De-identification, and Data Privacy (LEGAL2026 and CALD-pseudo 2026) @ LREC 2026
- A Taxonomy of Safety: Harmonizing LLM Benchmarks in a Fragmented Landscape
- From Understanding to Generation: An Efficient Shortcut for Evaluating Language Models
- Stratified Selective Sampling for Instruction Tuning with Dedicated Scoring Strategy