Artificial Intelligence in Engineering: Generating Parametric CATIA Models from Static STEP Files
Exchanging data via the neutral STEP format is common in modern product development, but it poses a significant challenge for engineers: imported models typically arrive without their original design history. Because parameters and logical relationships are lost during the transfer, modifying such components often requires time-consuming manual remodeling. The objective of this thesis was to replace that manual effort with an AI-supported automation process.
The success of generative AI depends heavily on the quality of the training data. As a first step, CAD models of geometric primitives were analyzed and converted into a text format that language models can interpret. This formed the basis for the subsequent tokenization, in which the data were split into tokens, that is, processable units composed of letters, digits, and special characters, using the tokenizer of the LLaMA 3.1 8B Instruct model.
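As a rough illustration of this preparation step, the sketch below reads a few STEP entity lines and flattens them into compact text that could then be passed to a tokenizer. The sample data, the function names (parse_entities, references, to_model_text) and the output layout are hypothetical and are not taken from the thesis. The regular expression only handles simple single-line entities, not the full ISO 10303-21 grammar, and the tokenizer call is left out because it requires downloading the model.

```python
import re

# Matches one simple entity instance from the DATA section of a STEP file,
# e.g. "#13=CYLINDRICAL_SURFACE('',#12,5.);"
ENTITY_RE = re.compile(r"#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\)\s*;")


def parse_entities(step_text):
    """Return {entity_id: (entity_type, raw_argument_string)}."""
    entities = {}
    for line in step_text.splitlines():
        match = ENTITY_RE.fullmatch(line.strip())
        if match:
            entities[int(match.group(1))] = (match.group(2), match.group(3))
    return entities


def references(args):
    """Return the ids of other entities that an argument string points to."""
    return [int(ref) for ref in re.findall(r"#(\d+)", args)]


def to_model_text(entities):
    """Flatten the entities into one line each (hypothetical layout)."""
    return "\n".join(
        f"{eid} {etype} {args}"
        for eid, (etype, args) in sorted(entities.items())
    )


# Hypothetical sample: the surface of a cylinder with radius 5 about the z-axis.
SAMPLE = """\
#10=CARTESIAN_POINT('',(0.,0.,0.));
#11=DIRECTION('',(0.,0.,1.));
#12=AXIS2_PLACEMENT_3D('',#10,#11,$);
#13=CYLINDRICAL_SURFACE('',#12,5.);
"""

entities = parse_entities(SAMPLE)
print(to_model_text(entities))
print(references(entities[12][1]))  # ids that the placement refers to
```

The reference lists show why the relationships between entities matter: the cylindrical surface is only meaningful together with the placement, point, and direction it points to.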
To adapt the pre-trained model specifically to the syntax of CATIA macros (VBA) and the structure of STEP files, targeted fine-tuning was performed. Because training large language models is computationally expensive, the thesis combined two techniques that reduce the resources required.
First, Low-Rank Adaptation (LoRA) was employed. This technique keeps the pre-trained weights frozen and trains only small low-rank matrices added alongside them, which drastically reduces the number of trainable parameters without overwriting the model's original knowledge. Second, DeepSpeed, a library for optimizing deep learning training, was used to further improve memory and computational efficiency.
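The saving that LoRA provides can be made concrete with a little arithmetic. The sketch below is a minimal pure-Python illustration of the general technique, not the training code used in the thesis. The matrix size, rank, and scaling factor are assumed values chosen for illustration. A frozen weight matrix W is left untouched, and only two small matrices A and B are trained, so the layer computes W x + (alpha / rank) * B A x.

```python
def full_params(d_out, d_in):
    """Number of weights in a dense d_out x d_in matrix."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable weights with LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x); W stays frozen."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


# Assumed example: one 4096 x 4096 projection matrix adapted with rank 8.
dense = full_params(4096, 4096)      # 16777216
adapter = lora_params(4096, 4096, 8)  # 65536
print(f"trainable share: {adapter / dense:.4%}")

# Tiny numeric check: with B initialised to zero the output equals W x,
# so the pre-trained behaviour is unchanged at the start of fine-tuning.
W = [[1, 0, 2], [0, 1, 0]]
A = [[1, 1, 1]]
x = [1, 2, 3]
untrained = lora_forward(W, A, [[0], [0]], x, alpha=2, rank=1)
trained = lora_forward(W, A, [[0.5], [0]], x, alpha=2, rank=1)
print(untrained, trained)
```

Under these assumed dimensions the adapter holds fewer than half a percent of the weights of the dense matrix, which is where the reduction in trainable parameters comes from.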
Through this combination of structured data preparation and resource-efficient training, the model was successfully trained to interpret the geometric relationships contained in STEP files. The system can recognize patterns learned from the training data and generate corresponding program code. The thesis thus demonstrates that large language models are, in principle, well suited to bridging the gap between static exchange formats and parametric design, opening up new perspectives for automation in engineering.
